Abstract

Except for special classes of games, there is no systematic framework for analyzing the
dynamical properties of multi-agent strategic interactions. Potential games are one such
special but restrictive class of games that allow for tractable dynamic analysis. Intuitively,
games that are "close" to a potential game should share similar properties. In this paper,
we formalize and develop this idea by quantifying to what extent the dynamic features of
potential games extend to "near-potential" games.

We study convergence of three commonly studied classes of adaptive dynamics: discrete-time
better/best response, logit response, and discrete-time fictitious play dynamics. For
better/best response dynamics, we focus on the evolution of the sequence of pure strategy
profiles and show that this sequence converges to a (pure) approximate equilibrium set,
whose size is a function of the "distance" from a close potential game. We then study logit
response dynamics parametrized by a smoothing parameter that determines the frequency
with which the best response strategy is played. Our analysis uses a Markov chain
representation for the evolution of pure strategy profiles. We provide a characterization of the
stationary distribution of this Markov chain in terms of the distance of the game from a
close potential game and the corresponding potential function. We further show that the
stochastically stable strategy profiles (defined as those that have positive probability under
the stationary distribution in the limit as the smoothing parameter goes to 0) are pure
approximate equilibria. Finally, we turn attention to fictitious play, and establish that in
near-potential games, the sequence of empirical frequencies of player actions converges to a
neighborhood of (mixed) equilibria of the game, where the size of the neighborhood increases
with the distance of the game to a potential game. Thus, our results suggest that games that
are close to a potential game inherit the dynamical properties of potential games. Since a
close potential game to a given game can be found by solving a convex optimization problem,
our approach also provides a systematic framework for studying convergence behavior
of adaptive learning dynamics in arbitrary finite strategic form games.
1. Introduction

The study of multi-agent strategic interactions, both in economics and engineering, mainly
relies on the concept of Nash equilibrium. This raises the question of whether Nash equilibrium
makes approximately accurate predictions of user behavior. One possible justification
for Nash equilibrium is that it arises as the long-run outcome of dynamical processes in
which less than fully rational players search for optimality over time. However, unless the
game belongs to special (but restrictive) classes of games, such dynamics need not converge to
a Nash equilibrium, and there is no systematic analysis of their limiting behavior (Fudenberg
and Levine 1998, Jordan 1993, Shapley 1964).



Potential games are a class of games for which many of the simple user dynamics, such
as best response dynamics and fictitious play, converge to a Nash equilibrium (Fudenberg
and Levine 1998, Monderer and Shapley 1996a,b, Sandholm 2010, Young 2004). Intuitively,
dynamics in potential games and dynamics in games that are "close" (in terms of the payoffs
of the players) to potential games should be related. Our goal in this paper is to make
this intuition precise and provide a systematic framework for studying dynamics in finite
strategic form games by exploiting their relation to close potential games.

We start by illustrating via examples that this "continuity" property of limiting dynamics
need not hold for arbitrary games, i.e., games that are close in terms of payoffs may
have significantly different limiting behavior under simple user dynamics. Our first example
focuses on better response dynamics, in which at each step or strategy profile, a player
(chosen consecutively or at random) updates its strategy unilaterally to one that yields a better
payoff.^1

Example 1.1. Consider two games with two players and payoffs given in Figure 1. The
entries of these tables indexed by row X and column Y show payoffs of the players when
the first player uses strategy X and the second player uses strategy Y. Let $0 < \theta \ll 1$.
Both games have a unique Nash equilibrium: (B, B) for $\mathcal{G}_1$, and a mixed strategy
profile for $\mathcal{G}_2$.

We consider convergence of the sequence of pure strategy profiles generated by the better
response dynamics. In $\mathcal{G}_1$, the sequence converges to strategy profile (B, B). In
$\mathcal{G}_2$, the sequence does not converge (it can be shown that the sequence follows the
better response cycle (A, A), (B, A), (B, B) and (A, B)). Thus, trajectories are not contained in any
$\epsilon$-equilibrium set for $\epsilon < 2$.

The second example considers fictitious play dynamics, where at each step, each player
maintains an (independent) empirical frequency distribution of the other players' strategies and
plays a best response against it.
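The update rule just described can be sketched in a few lines. This is a minimal illustration for two-player games, not the formal definition used later in the paper: the function name `fictitious_play`, the nested-list payoff layout, the lowest-index tie-breaking rule, and the 2x2 coordination game are all assumptions made for this sketch.

```python
def fictitious_play(U1, U2, start, steps):
    """Two-player fictitious play; returns each player's empirical action frequencies.

    U1[i][j], U2[i][j]: payoffs of players 1 and 2 when player 1 plays i and player 2 plays j.
    start: initial pure profile (i, j), counted as the first observation.
    """
    n1, n2 = len(U1), len(U1[0])
    counts1 = [0] * n1  # how often player 1 has played each action
    counts2 = [0] * n2  # how often player 2 has played each action
    counts1[start[0]] += 1
    counts2[start[1]] += 1
    for _ in range(steps):
        # Payoff of each action against the opponent's empirical counts
        # (unnormalized counts suffice because scaling does not change the argmax).
        v1 = [sum(U1[i][j] * counts2[j] for j in range(n2)) for i in range(n1)]
        v2 = [sum(U2[i][j] * counts1[i] for i in range(n1)) for j in range(n2)]
        a1 = v1.index(max(v1))  # first maximizer: ties go to the lowest index
        a2 = v2.index(max(v2))
        counts1[a1] += 1
        counts2[a2] += 1
    total = steps + 1
    return [c / total for c in counts1], [c / total for c in counts2]


# Hypothetical coordination game (a potential game): payoff 2 at (0, 0), 1 at (1, 1).
U = [[2, 0], [0, 1]]
f1, f2 = fictitious_play(U, U, start=(1, 1), steps=200)
g1, g2 = fictitious_play(U, U, start=(0, 1), steps=200)
```

In this coordination game the empirical frequencies settle on a pure equilibrium from either starting profile, which is the kind of behavior the examples contrast with games that are not close to a potential game.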



^1 Consider a game where players are not indifferent between their strategies at any strategy profile.
Arbitrarily small payoff perturbations of this game lead to games which have the same better response
structure as the original game. Hence, for a given game there may exist a close enough game such that the
outcomes of the better response dynamics in the two games are identical. However, for payoff differences of a given
size, it is always possible to find games with different better response properties, as the better
response example in this section illustrates.

Figure 1: A small change in payoffs results in significantly different behavior for the pure
strategy profiles generated by the better response dynamics.

Example 1.2. Consider two games with two players and payoffs given in Figure 2. Let
$0 < \theta \ll 1$. It can be seen that $\mathcal{G}_1$ has multiple equilibria (including pure equilibria (A, A),
(B, B) and (C, C)), whereas $\mathcal{G}_2$ has a unique equilibrium given by the mixed strategy profile
where both players assign probability 1/3 to each of their strategies.

Figure 2: A small change in payoffs results in significantly different behavior for the empirical
frequencies generated by the fictitious play dynamics.

We focus on the convergence of the sequence of empirical frequencies generated by the
fictitious play dynamics (under the assumption that the initial empirical frequency distribution
assigns probability 1 to a pure strategy profile, and that whenever players are indifferent between
different strategies, they choose the lexicographically smaller one). In $\mathcal{G}_1$, this sequence
converges to a pure equilibrium starting from any pure strategy profile. In $\mathcal{G}_2$, the sequence
displays oscillations similar to those seen in the Shapley game (see Fudenberg and Levine
(1998), Shapley (1964)). To see this, assume that the initial empirical frequency distribution
assigns probability 1 to the strategy profile (A, A). Observe that since the underlying game
is symmetric, the empirical frequency distributions of the two players are identical at all
steps. Starting from (A, A), both players update their strategy to C. After sufficiently many
updates, the empirical frequency of A falls below $\theta/(1+\theta)$, and that of C exceeds $1/(1+\theta)$.
Thus, the payoff specifications suggest that both players start using strategy B. Similarly,
after the empirical frequency of B exceeds $1/(1+\theta)$ and that of C falls below $\theta/(1+\theta)$,
both players start playing A. Observe that an update to a new strategy takes place only when one
of the strategies is being used with very high probability (recall that $\theta \ll 1$), and this feature
of the empirical frequencies is preserved throughout. For this reason, the sequence of empirical
frequencies does not converge to (1/3, 1/3, 1/3), the unique Nash equilibrium of $\mathcal{G}_2$.

In this paper, in contrast with the preceding examples, we will show that games that
are close (in terms of payoffs of players) to potential games have similar limiting dynamics
to those in potential games. In particular, many reasonable adaptive dynamics "converge"
to an approximate equilibrium set, whose size is a function of the distance of the game
to a close potential game. Our approach relies on using the potential function of a close
potential game for the analysis of commonly studied update rules.^2 We note that our results
hold for arbitrary strategic form games; however, our characterization of the limiting behavior of
dynamics is more informative for games that are close to potential games. We therefore focus
our investigation on such games in this paper and refer to them as near-potential games.

We start our analysis by introducing the maximum pairwise difference, a measure of "closeness"
of games. Let $p$ and $q$ be two strategy profiles which differ in the strategy of a
single player, say player $m$. We refer to the change in the payoff of player $m$ between these
two strategy profiles as the pairwise comparison of $p$ and $q$. Intuitively, this quantity
captures how much player $m$ can improve its utility by unilaterally deviating from strategy
profile $p$ to strategy profile $q$. For given games, the maximum pairwise difference is defined
as the maximum difference between the pairwise comparisons of these games. Thus, the
maximum pairwise difference captures how different two games are in terms of the utility
improvements due to unilateral deviations. Since equilibria of games, and strategy updates
in various update rules (such as better/best response dynamics), can be expressed in terms
of unilateral deviations, the maximum pairwise difference provides a measure of the strategic
similarity of games. We show that the closest potential game to a given game, in the sense
of maximum pairwise difference, can be obtained by solving a convex optimization problem.
This provides a systematic way of approximating a given game with a potential game that
has a similar equilibrium set and dynamic properties, as illustrated in Example 1.3.



Example 1.3. Consider a two-player game $\mathcal{G}$, which is not a potential game, and the closest
potential game to this game (in terms of maximum pairwise difference), $\hat{\mathcal{G}}$, given in Figure 3.
The maximum pairwise difference of these games is 2, since the utility improvements in these
games due to unilateral deviations differ by at most 2. (For instance, consider the deviation
of the column player from (A, A) to (A, B). In $\mathcal{G}$ this leads to a utility improvement of 6,
whereas in $\hat{\mathcal{G}}$ the improvement is 4.) It can be seen that for both games (B, B)
is the unique equilibrium. Moreover, trajectories of better response dynamics and empirical
frequencies of fictitious play dynamics converge to this equilibrium in both games.

Figure 3: A game ($\mathcal{G}$) and a nearby potential game ($\hat{\mathcal{G}}$) share similar equilibrium sets and
dynamic properties.



We focus on three commonly studied user dynamics: discrete-time better/best response,
logit response, and discrete-time fictitious play dynamics, and establish different notions
of convergence for each. We first study better/best response dynamics. It is known that
the sequences of pure strategy profiles, which we refer to as trajectories, generated by these
update rules converge to pure Nash equilibria in potential games (Monderer and Shapley
1996b, Young 2004). In near-potential games, a pure Nash equilibrium need not even exist.
For this reason we focus on the notion of pure approximate equilibria or $\epsilon$-equilibria, and show
that in near-potential games trajectories of these update rules converge to a pure approximate
equilibrium set. The size of this set depends only on the distance of the original game from
a potential game, and is independent of the payoffs in the original game.

^2 Throughout the paper, we use the terms learning dynamics and update rules interchangeably.

We then focus on logit response dynamics. With this update rule, agents, when updating
their strategies, choose their best responses with high probability, but also explore other
strategies with a nonzero probability. Logit response induces a Markov chain on the set of
pure strategy profiles. The stationary distribution of this Markov chain is used to explain
the limiting behavior of this update rule (Alos-Ferrer and Netzer 2010, Blume 1993, 1997,
Marden and Shamma 2008, Young 1993). In potential games, the stationary distribution can
be expressed in closed form in terms of the potential function of the game. Additionally, the
stochastically stable strategy profiles, i.e., the strategy profiles which have nonzero stationary
probability as the exploration probability goes to zero, are those that maximize the potential
function (Alos-Ferrer and Netzer 2010, Blume 1997, Marden and Shamma 2008). Exploiting
their relation to close potential games, we obtain similar results for near-potential games: (i)
we obtain an explicit characterization of the stationary distribution in terms of the distance
of the game from a close potential game and the corresponding potential function, and
(ii) we show that the stochastically stable strategy profiles are the strategy profiles that
approximately maximize the potential of a close potential game, implying that they are pure
approximate equilibria of the game. Our analysis relies on a novel perturbation result for
Markov chains (see Theorem 5.1), which provides bounds on deviations from a stationary
distribution when transition probabilities of a Markov chain are multiplicatively perturbed,
and therefore may be of independent interest.
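The closed form for potential games can be checked numerically on a small example. The sketch below assumes a common version of logit response that is not spelled out in this part of the paper: at each step one of the two players is picked uniformly at random and chooses each action with probability proportional to exp(payoff / tau). Under that assumption, the distribution proportional to exp(phi / tau) is stationary. The function names, the payoff matrices and the value of tau are illustrative.

```python
import math
from itertools import product


def logit_transition_matrix(U1, U2, tau):
    """Transition probabilities of the assumed logit response chain on pure profiles."""
    n1, n2 = len(U1), len(U1[0])
    profiles = list(product(range(n1), range(n2)))
    P = {p: {q: 0.0 for q in profiles} for p in profiles}
    for (i, j) in profiles:
        # Player 1 revises with probability 1/2; player 2's action j stays fixed.
        w = [math.exp(U1[a][j] / tau) for a in range(n1)]
        for a in range(n1):
            P[(i, j)][(a, j)] += 0.5 * w[a] / sum(w)
        # Player 2 revises with probability 1/2; player 1's action i stays fixed.
        w = [math.exp(U2[i][b] / tau) for b in range(n2)]
        for b in range(n2):
            P[(i, j)][(i, b)] += 0.5 * w[b] / sum(w)
    return profiles, P


def gibbs_distribution(phi, profiles, tau):
    """Candidate stationary distribution, proportional to exp(phi / tau)."""
    w = {p: math.exp(phi[p[0]][p[1]] / tau) for p in profiles}
    z = sum(w.values())
    return {p: w[p] / z for p in profiles}


# Hypothetical 2x2 potential game; phi matches every unilateral payoff change.
U1 = [[3, 0], [4, 1]]
U2 = [[3, 4], [0, 1]]
phi = [[0, 1], [1, 2]]

tau = 0.5
profiles, P = logit_transition_matrix(U1, U2, tau)
mu = gibbs_distribution(phi, profiles, tau)
# Apply one step of the chain to mu: (mu P)(q) = sum_p mu(p) P(p, q).
mu_next = {q: sum(mu[p] * P[p][q] for p in profiles) for q in profiles}
max_gap = max(abs(mu_next[q] - mu[q]) for q in profiles)
```

Here `max_gap` is zero up to floating-point error, and the profile with the largest stationary probability is the potential maximizer (1, 1), consistent with the statement about stochastically stable profiles.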

A summary of our convergence results on better/best response and logit response dynamics
can be found in Table 1.

| Update Rule | Convergence Result |
|---|---|
| Better/Best Response Dynamics | (Theorem 4.1) Trajectories of dynamics converge to $X_{\delta h}$, i.e., the $\delta h$-equilibrium set of $\mathcal{G}$. |
| Logit Response Dynamics (with parameter $\tau$) | (Corollary 5.2) Stationary distribution $\mu_\tau$ of logit response dynamics is such that [bound omitted in the source]. |
| Logit Response Dynamics | (Corollary 5.3) Stochastically stable strategy profiles of $\mathcal{G}$ are (i) contained in $S = \{p \mid \phi(p) \geq \max_q \phi(q) - 4\delta(h-1)\}$, (ii) $4\delta h$-equilibria of $\mathcal{G}$. |

Table 1: Convergence properties of better/best response and logit response dynamics in
near-potential games. Given a game $\mathcal{G}$, we use $\hat{\mathcal{G}}$ to denote a nearby potential game with
potential function $\phi$ such that the distance (in terms of the maximum pairwise difference,
defined in Section 2) between the two games is $\delta$. We use the notation $X_\epsilon$ to denote the
$\epsilon$-equilibrium set of the original game, $h$ to denote the number of strategy profiles, and $\mu_\tau$ and $\hat{\mu}_\tau$
to denote the stationary distributions of logit response dynamics in $\mathcal{G}$ and $\hat{\mathcal{G}}$, respectively.



We finally analyze fictitious play dynamics in near-potential games. In potential games,
trajectories of fictitious play need not converge to a Nash equilibrium, but the empirical
frequencies of the played strategies converge to a (mixed) Nash equilibrium (Monderer and
Shapley 1996a, Shamma and Arslan 2004). In our analysis of fictitious play dynamics,
we first show that in near-potential games, if the empirical frequencies are outside some
$\epsilon$-equilibrium set, then the potential of the close potential game (evaluated at the empirical
frequency distribution) increases with each strategy update. Using this result we establish
convergence of fictitious play dynamics to a set which can be characterized in terms of the
$\epsilon$-equilibrium set of the game and the level sets of the potential function of a close potential
game. This result suggests that in near-potential games, the empirical frequencies of fictitious
play converge to a set of mixed strategies that (in the close potential game) have potential
almost as large as the potential of Nash equilibria. Moreover, exploiting the property that for
small $\epsilon$, $\epsilon$-equilibria are contained in disjoint neighborhoods of equilibria, we strengthen our
result and establish that if a game is sufficiently close to a potential game, then the empirical
frequencies of fictitious play dynamics converge to a small neighborhood of equilibria. This
result recovers as a special case the convergence of empirical frequencies to Nash equilibria in
potential games.

A summary of our results on convergence of fictitious play dynamics is given in Table 2.

| Update Rule | Convergence Result |
|---|---|
| Fictitious Play | (Corollary 6.1) Empirical frequencies of dynamics converge to the set of mixed strategies with large enough potential: $\{x \in \prod_m \Delta E^m \mid \phi(x) \geq \min_{y \in X_{M\delta}} \phi(y)\}$. |
| Fictitious Play | (Theorem 6.2) Assume that $\mathcal{G}$ has finitely many equilibria. There exist some $\bar{\delta} > 0$ and $\bar{\epsilon} > 0$ (which are functions of the utilities of $\mathcal{G}$ but not of $\delta$) such that if $\delta < \bar{\delta}$, then the empirical frequencies of fictitious play converge to $\{x \mid \|x - x_k\| \leq \frac{4 f(M\delta) M L}{\epsilon} + f(M\delta + \epsilon), \text{ for some equilibrium } x_k\}$ for any $\epsilon$ such that $\bar{\epsilon} \geq \epsilon > 0$, where $f : \mathbb{R}_+ \to \mathbb{R}_+$ is an upper semicontinuous function such that $f(x) \to 0$ as $x \to 0$. |

Table 2: Convergence properties of fictitious play dynamics in near-potential games. We
denote the number of players in the game by $M$, the set of mixed strategies of player $m$ by
$\Delta E^m$, and the Lipschitz constant of the mixed extension of $\phi$ by $L$. The rest of the notation is
the same as in Table 1.



The framework provided in this paper enables us to study the limiting behavior of adap- 
tive user dynamics in arbitrary finite strategic form games. In particular, for a given game 
we can use the proposed convex optimization formulation to find a nearby potential game 
and use the distance between these games to obtain a quantitative characterization of the 
limiting approximate equilibrium set. The characterization this approach provides will be 
tighter if the original game is closer to a potential game. 



Related Literature: Potential games play an important role in game-theoretic analysis
because of the existence of pure strategy Nash equilibria, and the stability (under various
learning dynamics such as better/best response dynamics) of pure Nash equilibria in these
games (Fudenberg and Levine 1998, Monderer and Shapley 1996b, Young 2004). Because of
these properties, potential games have found applications in various control and resource allocation
problems (Arslan et al. 2007, Candogan et al. 2010a, Marden et al. 2009a, Monderer and
Shapley 1996b).



There is no systematic framework for analyzing the limiting behavior of many of the
adaptive update rules in general games (Fudenberg and Levine 1998, Jordan 1993, Shapley
1964). However, for potential games there is a long line of literature establishing convergence
of natural adaptive dynamics such as better/best response dynamics (Monderer and Shapley
1996b, Young 2004), fictitious play (Hofbauer and Sandholm 2002, Marden et al. 2009b,
Monderer and Shapley 1996a, Shamma and Arslan 2004) and logit response dynamics
(Alos-Ferrer and Netzer 2010, Blume 1993, 1997, Marden and Shamma 2008).

It was shown in recent work that a close potential game to a given game can be obtained
by solving a convex optimization problem (see Candogan et al. (2011, 2010b)). It was also
proved that equilibria of a given game can be characterized by first approximating this game
with a potential game, and then using the equilibrium properties of close potential games
(Candogan et al. 2011, 2010b). This paper builds on this line of work to study dynamics in
games by exploiting their relation to a close potential game.

Paper Organization: The rest of the paper is organized as follows. We present the
game-theoretic preliminaries for our work in Section 2. In Section 3, we explain how a close
potential game to a given game can be found, and discuss possible extensions of this approach.
We present an analysis of better and best response dynamics in near-potential games in
Section 4. In Section 5, we extend our analysis to logit response, and focus on the stationary
distribution and stochastically stable states of logit response. We present the results
on fictitious play and its extensions in Section 6. We close in Section 7 with concluding
remarks and future work.

2. Preliminaries 

In this section, we present the game-theoretic background that is relevant to our work. 
Additionally, we introduce the closeness measure for games, which is used in the rest of the 
paper. 

2.1. Finite Strategic Form Games 

Our focus in this paper is on finite strategic form games. A (noncooperative) finite game 
in strategic form consists of: 

• A finite set of players, denoted by $\mathcal{M} = \{1, \ldots, M\}$.

• Strategy spaces: A finite set of strategies (or actions) $E^m$, for every $m \in \mathcal{M}$.

• Utility functions: $u^m : \prod_{k \in \mathcal{M}} E^k \to \mathbb{R}$, for every $m \in \mathcal{M}$.

We denote a (strategic form) game instance by the tuple $(\mathcal{M}, \{E^m\}_{m \in \mathcal{M}}, \{u^m\}_{m \in \mathcal{M}})$ and
the joint strategy space of this game instance by $E = \prod_{m \in \mathcal{M}} E^m$. We refer to a collection
of strategies of all players as a strategy profile and denote it by $p = (p^1, \ldots, p^M) \in E$. The
collection of strategies of all players but the $m$th one is denoted by $p^{-m}$.

The basic solution concept in a noncooperative game is that of a Nash Equilibrium (NE). 
A (pure) Nash equilibrium is a strategy profile from which no player can unilaterally deviate 
and improve its payoff. Formally, $p$ is a Nash equilibrium if

$$u^m(q^m, p^{-m}) - u^m(p^m, p^{-m}) \leq 0,$$

for every $q^m \in E^m$ and $m \in \mathcal{M}$.

To address strategy profiles that are approximately a Nash equilibrium, we use the concept
of $\epsilon$-equilibrium. A strategy profile $p = (p^1, \ldots, p^M)$ is an $\epsilon$-equilibrium ($\epsilon \geq 0$) if

$$u^m(q^m, p^{-m}) - u^m(p^m, p^{-m}) \leq \epsilon$$

for every $q^m \in E^m$ and $m \in \mathcal{M}$. We denote the set of $\epsilon$-equilibria in a game $\mathcal{G}$ by $X_\epsilon$. Note
that a Nash equilibrium is an $\epsilon$-equilibrium with $\epsilon = 0$.
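The definition translates directly into a check over unilateral deviations. The following sketch enumerates the pure epsilon-equilibria of a two-player game given as payoff matrices; the function names and the matching-pennies-style payoffs are illustrative assumptions, not taken from the paper.

```python
def max_unilateral_gain(U1, U2, profile):
    """Largest payoff improvement any single player can obtain by deviating from profile."""
    i, j = profile
    gain1 = max(U1[a][j] for a in range(len(U1))) - U1[i][j]     # player 1 deviates
    gain2 = max(U2[i][b] for b in range(len(U2[0]))) - U2[i][j]  # player 2 deviates
    return max(gain1, gain2)


def epsilon_equilibria(U1, U2, eps):
    """All pure profiles from which no unilateral deviation gains more than eps."""
    return [(i, j)
            for i in range(len(U1))
            for j in range(len(U1[0]))
            if max_unilateral_gain(U1, U2, (i, j)) <= eps]


# Hypothetical matching-pennies-style game: it has no pure Nash equilibrium.
U1 = [[1, -1], [-1, 1]]
U2 = [[-1, 1], [1, -1]]
pure_nash = epsilon_equilibria(U1, U2, eps=0)
two_equilibria = epsilon_equilibria(U1, U2, eps=2)
```

With `eps=0` the list is empty, while with `eps=2` every pure profile qualifies, which shows how the approximate equilibrium set grows with epsilon even when no pure Nash equilibrium exists.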

2.2. Potential Games 

We next describe a particular class of games that is central in this paper, the class of
potential games (Monderer and Shapley 1996b).

Definition 2.1 (Potential Game). A potential game is a noncooperative game for which
there exists a function $\phi : E \to \mathbb{R}$ satisfying

$$\phi(p^m, p^{-m}) - \phi(q^m, p^{-m}) = u^m(p^m, p^{-m}) - u^m(q^m, p^{-m}), \qquad (1)$$

for every $m \in \mathcal{M}$, $p^m, q^m \in E^m$, $p^{-m} \in E^{-m}$. The function $\phi$ is referred to as a potential
function of the game.

This definition ensures that the change in the utility of a player who unilaterally deviates
to a new strategy coincides exactly with the corresponding change in the potential function.
Extensions of this definition in which equation (1) holds when each utility function is
multiplied by a (possibly different) positive weight, or in which changes in utility and potential only
agree in sign, give rise to weighted and ordinal potential games, which share similar properties
with potential games. We briefly discuss some of these extensions in Section 3. However, our
main focus in this paper is on potential games as in Definition 2.1.

Some properties that are specific to potential games are evident from the definition. For 
instance, it can be seen that unilateral deviations from a strategy profile that maximizes the 
potential function (weakly) decrease the utility of the deviating player. Hence, this strategy 
profile corresponds to a Nash equilibrium, and it follows that every potential game has a 
pure Nash equilibrium. 

Another important property of potential games, which will be used for characterizing
the limiting behavior of dynamics in near-potential games, is that the total unilateral utility
improvement around a "closed path" is equal to zero. Before we formally state this result,
we first provide some necessary definitions, which are also used in Section 4 when we analyze
better/best response dynamics in near-potential games.



Definition 2.2 (Path - Closed Path - Improvement Path). A path is a collection of strategy
profiles $\gamma = (p_0, \ldots, p_N)$ such that $p_i$ and $p_{i+1}$ differ in the strategy of exactly one player. A
path is a closed path (or a cycle) if $p_0 = p_N$. A path is an improvement path if $u^{m_i}(p_i) >
u^{m_i}(p_{i-1})$, where $m_i$ is the player who modifies its strategy when the strategy profile is updated
from $p_{i-1}$ to $p_i$.

The transition from strategy profile $p_{i-1}$ to $p_i$ is referred to as step $i$ of the path. The
length of a path is equal to its number of steps, i.e., the length of the path $\gamma = (p_0, \ldots, p_N)$
is $N$. We say that a closed path is simple if no strategy profile other than the first and the
last strategy profiles is repeated along the path. For any path $\gamma = (p_0, \ldots, p_N)$, let $I(\gamma)$
represent the total utility improvement along the path, i.e.,

$$I(\gamma) = \sum_{i=1}^{N} \left( u^{m_i}(p_i) - u^{m_i}(p_{i-1}) \right),$$

where $m_i$ is the index of the player that modifies its strategy in the $i$th step of the path.
The following proposition provides a necessary and sufficient condition under which a given
game is a potential game.



Proposition 2.1 (Monderer and Shapley (1996b)). A game is a potential game if and only
if $I(\gamma) = 0$ for all simple closed paths $\gamma$.
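For a two-player game the closed-path condition can be tested by brute force. The sketch below relies on a standard refinement due to Monderer and Shapley that is not stated in the text here, namely that it suffices to check simple closed paths of length 4; the function names and example payoffs are illustrative assumptions.

```python
from itertools import product


def four_cycle_improvement(U1, U2, i, j, a, b):
    """I(gamma) for the closed path (i,j) -> (a,j) -> (a,b) -> (i,b) -> (i,j)."""
    return ((U1[a][j] - U1[i][j])     # step 1: player 1 moves i -> a
            + (U2[a][b] - U2[a][j])   # step 2: player 2 moves j -> b
            + (U1[i][b] - U1[a][b])   # step 3: player 1 moves a -> i
            + (U2[i][j] - U2[i][b]))  # step 4: player 2 moves b -> j


def is_potential_game(U1, U2, tol=1e-9):
    """True if the total improvement vanishes on every 4-step closed path."""
    n1, n2 = len(U1), len(U1[0])
    return all(abs(four_cycle_improvement(U1, U2, i, j, a, b)) <= tol
               for i, a in product(range(n1), repeat=2)
               for j, b in product(range(n2), repeat=2))


# Hypothetical examples: a prisoner's-dilemma-style game and a matching-pennies-style game.
pd_is_potential = is_potential_game([[3, 0], [4, 1]], [[3, 4], [0, 1]])
mp_is_potential = is_potential_game([[1, -1], [-1, 1]], [[-1, 1], [1, -1]])
```

The first game passes the test and the second does not; in the second game the improvement around the single nontrivial cycle is -8, which is the kind of nonzero cycle sum that a nearby potential game has to absorb.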

We conclude this section by formally defining the measure of "closeness" of games, used 
in the subsequent sections. 

Definition 2.3 (Maximum Pairwise Difference). Let $\mathcal{G}$ and $\hat{\mathcal{G}}$ be two games with set of
players $\mathcal{M}$, set of strategy profiles $E$, and collections of utility functions $\{u^m\}_{m \in \mathcal{M}}$ and
$\{\hat{u}^m\}_{m \in \mathcal{M}}$, respectively. The maximum pairwise difference (MPD) between these games is
defined as

$$d(\mathcal{G}, \hat{\mathcal{G}}) = \max_{p \in E,\ m \in \mathcal{M},\ q^m \in E^m} \left| \left( u^m(q^m, p^{-m}) - u^m(p^m, p^{-m}) \right) - \left( \hat{u}^m(q^m, p^{-m}) - \hat{u}^m(p^m, p^{-m}) \right) \right|.$$

Note that the pairwise difference $u^m(q^m, p^{-m}) - u^m(p^m, p^{-m})$ quantifies how much player
$m$ can improve its utility by unilaterally deviating from strategy profile $(p^m, p^{-m})$ to strategy
profile $(q^m, p^{-m})$. Thus, the MPD captures how different two games are in terms of the utility
improvements due to unilateral deviations.^3 We refer to pairs of games with small MPD as
close games, and games that have a small MPD to a potential game as near-potential games.



^3 An alternative distance measure can be given by

$$d_2(\mathcal{G}, \hat{\mathcal{G}}) = \left( \sum_{m \in \mathcal{M},\ p \in E} \ \sum_{q^m \in E^m} \left( \left( u^m(q^m, p^{-m}) - u^m(p^m, p^{-m}) \right) - \left( \hat{u}^m(q^m, p^{-m}) - \hat{u}^m(p^m, p^{-m}) \right) \right)^2 \right)^{1/2},$$

and this quantity corresponds to the 2-norm of the difference of $\mathcal{G}$ and $\hat{\mathcal{G}}$ in terms of the utility improvements
due to unilateral deviations. Our analysis of the limiting behavior of dynamics relies on the maximum of
such utility improvement differences between a game and a near-potential game. Thus, the measure in
Definition 2.3 provides tighter bounds for our dynamics results, and hence is preferred in this paper.



The MPD measures the closeness of games in terms of the difference of unilateral deviations,
rather than the difference of their utility functions, i.e., quantities of the form

$$\left| \left( u^m(q^m, p^{-m}) - u^m(p^m, p^{-m}) \right) - \left( \hat{u}^m(q^m, p^{-m}) - \hat{u}^m(p^m, p^{-m}) \right) \right|$$

are used to identify close games, rather than quantities of the form $|u^m(p^m, p^{-m}) - \hat{u}^m(p^m, p^{-m})|$.
This is because the difference in unilateral deviations provides a better characterization of
the strategic similarities (equilibrium and dynamic properties) between two games than the
difference in utility functions. This can be seen from the following example: Consider two
games with utility functions $\{u^m\}$ and $\{u^m + 1\}$, i.e., in the second game players receive an
additional payoff of 1 at all strategy profiles. It can be seen from the definition of Nash
equilibrium that despite the difference of their utility functions, these two games share the same
equilibrium set. Intuitively, since the additional payoff is obtained at all strategy profiles, it
does not affect any of the strategic considerations in the game. While the utility difference
between these games is nonzero, it can be seen that the MPD is equal to zero. Hence the MPD
identifies a strategic equivalence between these games. The recent work Candogan et al.
(2011) contains a formal treatment of strategic equivalence and its implications for strategic
form games.
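The constant-shift example above can be checked numerically. The following sketch computes the MPD of Definition 2.3 for two-player games stored as pairs of payoff matrices; the function names and the specific payoffs are illustrative assumptions.

```python
def pairwise_comparisons(U1, U2):
    """Map each unilateral deviation (player, from-profile, new action) to its payoff change."""
    n1, n2 = len(U1), len(U1[0])
    comps = {}
    for i in range(n1):
        for j in range(n2):
            for a in range(n1):  # player 1 deviates from i to a
                comps[(1, (i, j), a)] = U1[a][j] - U1[i][j]
            for b in range(n2):  # player 2 deviates from j to b
                comps[(2, (i, j), b)] = U2[i][b] - U2[i][j]
    return comps


def mpd(game, other):
    """Maximum pairwise difference between two games with the same strategy sets."""
    c, c_hat = pairwise_comparisons(*game), pairwise_comparisons(*other)
    return max(abs(c[k] - c_hat[k]) for k in c)


G = ([[3, 0], [4, 1]], [[3, 4], [0, 1]])
G_shifted = ([[4, 1], [5, 2]], [[4, 5], [1, 2]])        # every payoff increased by 1
G_perturbed = ([[3, 0], [4, 1.5]], [[3, 4], [0, 1]])    # one payoff of player 1 changed by 0.5

d_shift = mpd(G, G_shifted)
d_perturbed = mpd(G, G_perturbed)
```

The shifted game is at MPD zero from the original even though every utility differs by 1, while changing a single payoff by 0.5 yields an MPD of 0.5.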



3. Finding Near-Potential Games 

In this section, we present a framework for finding the closest potential game to a given
game, where the distance between the games is measured in terms of the MPD. We formulate
the problem of identifying such a game as a convex optimization problem, and discuss
extensions of this approach. We note that a procedure for finding near-potential games can
be found in Candogan et al. (2011) and Candogan et al. (2010b). In these works the distance
between games is measured in terms of a 2-norm. In this section we illustrate how similar
ideas can be used when the distance is measured in terms of the MPD.



It can be seen from Proposition 2.1 that a game is a potential game if and only if it
satisfies linear equalities. This suggests that the set of potential games is convex,^4 i.e., if
$\mathcal{G} = (\mathcal{M}, E, \{u^m\}_m)$ and $\hat{\mathcal{G}} = (\mathcal{M}, E, \{\hat{u}^m\}_m)$ are potential games, then
$\mathcal{G}_\alpha = (\mathcal{M}, E, \{\alpha u^m + (1 - \alpha)\hat{u}^m\}_m)$ is also a potential game provided that $\alpha \in [0, 1]$.
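The convexity claim can also be verified directly from Definition 2.1. In the short derivation below, $\phi$ and $\hat{\phi}$ denote potential functions of the two potential games, and the symbols $\phi_\alpha$ and $u_\alpha^m$ are introduced here only for illustration:

```latex
\begin{align*}
\phi_\alpha &:= \alpha\,\phi + (1-\alpha)\,\hat{\phi},
\qquad u_\alpha^m := \alpha\,u^m + (1-\alpha)\,\hat{u}^m, \\
\phi_\alpha(p^m, p^{-m}) - \phi_\alpha(q^m, p^{-m})
 &= \alpha\big(\phi(p^m, p^{-m}) - \phi(q^m, p^{-m})\big)
  + (1-\alpha)\big(\hat{\phi}(p^m, p^{-m}) - \hat{\phi}(q^m, p^{-m})\big) \\
 &= \alpha\big(u^m(p^m, p^{-m}) - u^m(q^m, p^{-m})\big)
  + (1-\alpha)\big(\hat{u}^m(p^m, p^{-m}) - \hat{u}^m(q^m, p^{-m})\big) \\
 &= u_\alpha^m(p^m, p^{-m}) - u_\alpha^m(q^m, p^{-m}).
\end{align*}
```

Hence $\phi_\alpha$ is a potential function for the combined game. The argument only uses linearity, so it does not depend on $\alpha$ lying in $[0, 1]$.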

^4 A game is a weighted potential game if (1) in Definition 2.1 is replaced by

$$\phi(p^m, p^{-m}) - \phi(q^m, p^{-m}) = w^m \left( u^m(p^m, p^{-m}) - u^m(q^m, p^{-m}) \right)$$

for some positive player-specific weights $\{w^m\}$. If, instead of holding with equality, the left and right hand
sides of (1) only agree in sign, then the game is referred to as an ordinal potential game (Monderer and
Shapley 1996b). Despite the fact that weighted and ordinal potential games have similar desirable properties
to potential games, their sets are nonconvex, and finding the closest weighted/ordinal potential game to a
given game requires solving a nonconvex optimization problem (Candogan et al. 2010b).

Assume that a game with utility functions $\{u^m\}_m$ is given. The closest potential game
(in terms of MPD) to this game, with payoff functions $\{\hat{u}^m\}_m$ and potential function $\phi$, can
be obtained by solving the following optimization problem:

$$
\begin{aligned}
(\mathrm{P}) \quad \min_{\phi,\ \{\hat{u}^m\}_m} \ & \max_{p \in E,\ m \in \mathcal{M},\ q^m \in E^m} \left| \left( u^m(q^m, p^{-m}) - u^m(p^m, p^{-m}) \right) - \left( \hat{u}^m(q^m, p^{-m}) - \hat{u}^m(p^m, p^{-m}) \right) \right| \\
\text{s.t.} \ & \phi(q^m, p^{-m}) - \phi(p^m, p^{-m}) = \hat{u}^m(q^m, p^{-m}) - \hat{u}^m(p^m, p^{-m}), \\
& \text{for all } m \in \mathcal{M},\ p \in E,\ q^m \in E^m.
\end{aligned}
$$

Note that the difference $(u^m(q^m, p^{-m}) - u^m(p^m, p^{-m})) - (\hat{u}^m(q^m, p^{-m}) - \hat{u}^m(p^m, p^{-m}))$ is
linear in $\{\hat{u}^m\}_m$. Thus, the objective function is the maximum of absolute values of such linear functions, and
hence is convex in $\{\hat{u}^m\}_m$. The constraints of this optimization problem guarantee that the
game with payoff functions $\{\hat{u}^m\}_m$ is a potential game with potential $\phi$. Note that these
constraints are linear. Therefore, it follows that (P) is a convex optimization problem that
gives the closest potential game to a given game.
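One standard way to hand (P) to a solver is the epigraph reformulation, which turns the min-max objective into a linear program. The auxiliary variable $t$ below is introduced for illustration and is not part of the original formulation:

```latex
\begin{align*}
\min_{t,\ \phi,\ \{\hat{u}^m\}_m} \quad & t \\
\text{s.t.} \quad
 & -t \;\le\; \big(u^m(q^m, p^{-m}) - u^m(p^m, p^{-m})\big)
        - \big(\hat{u}^m(q^m, p^{-m}) - \hat{u}^m(p^m, p^{-m})\big) \;\le\; t, \\
 & \phi(q^m, p^{-m}) - \phi(p^m, p^{-m})
        = \hat{u}^m(q^m, p^{-m}) - \hat{u}^m(p^m, p^{-m}), \\
 & \text{for all } m \in \mathcal{M},\ p \in E,\ q^m \in E^m.
\end{align*}
```

At an optimum, $t$ equals the MPD between the given game and its closest potential game. All constraints are linear in $(t, \phi, \{\hat{u}^m\}_m)$, so any linear programming solver applies.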

Let $\mathcal{G}_1$ and $\mathcal{G}_2$ be games with utility functions $\{u^m\}_m$ and $\{w^m u^m\}_m$ respectively, where
for all $m \in \mathcal{M}$, $w^m \geq 1$ is a fixed weight. It can be seen that the preferences of players
are identical in these two games, i.e., $u^m(x^m, x^{-m}) - u^m(y^m, x^{-m}) > 0$ if and only if
$w^m u^m(x^m, x^{-m}) - w^m u^m(y^m, x^{-m}) > 0$ for any $m \in \mathcal{M}$, $y^m \in \Delta E^m$ and $x \in \prod_{k \in \mathcal{M}} \Delta E^k$.
Thus, it follows that the equilibrium sets of these games are the same, and for many of the
update rules in games,^5 such as better/best response dynamics and fictitious play (but not
logit response), the trajectories of dynamics in $\mathcal{G}_1$ and $\mathcal{G}_2$ are identical.

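This observation suggests that it may also be of interest to find a close potential game
to a "scaled version" of a given game. The following optimization formulation obtains such
a potential game:

$$
\begin{aligned}
(\mathrm{P2}) \quad \min_{\phi,\ \{\hat{u}^m\}_m,\ \{w^m\}_m} \ & \max_{p \in E,\ m \in \mathcal{M},\ q^m \in E^m} \left| w^m \left( u^m(q^m, p^{-m}) - u^m(p^m, p^{-m}) \right) - \left( \hat{u}^m(q^m, p^{-m}) - \hat{u}^m(p^m, p^{-m}) \right) \right| \\
\text{s.t.} \ & w^m \geq 1 \ \text{ for all } m \in \mathcal{M}, \\
& \phi(q^m, p^{-m}) - \phi(p^m, p^{-m}) = \hat{u}^m(q^m, p^{-m}) - \hat{u}^m(p^m, p^{-m}), \\
& \text{for all } m \in \mathcal{M},\ p \in E,\ q^m \in E^m.
\end{aligned}
$$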

The solution of (P2) is a potential game with utility functions $\{\hat{u}^m\}_m$. Comparing (P) and
(P2), it can be seen that (P2) obtains the closest potential game^6 (in terms of MPD) to the
game with utility functions $\{w^m u^m\}_m$. Since (P2) also minimizes the objective function over
$\{w^m\}$, the solution also reveals the "scaling" of the original game that makes it as close
as possible to a potential game.



* For formal definitions of these update rules, see Sections 4 (better and best response dynamics), 5 (logit response), and 6 (fictitious play).

† Note that the solution of (P2) is not the closest weighted potential game to the original game. Such a
game can be obtained by replacing the objective function by

\[
\max_{p \in E,\ m \in \mathcal{M},\ q^m \in E^m} \left| \left( u^m(q^m, p^{-m}) - u^m(p^m, p^{-m}) \right) - w^m \left( \hat{u}^m(q^m, p^{-m}) - \hat{u}^m(p^m, p^{-m}) \right) \right|.
\]

However, this objective function leads to a nonconvex optimization formulation due to the multiplication of
the terms $w^m$ and $\hat{u}^m$, and the solution of this problem is different than that of (P2). See Candogan et al.
(2010b) for details.

In the rest of the paper, we do not discuss how a close potential game to a given game is
obtained, but we just assume that a close potential game with potential $\phi$ is known and the
MPD between this game and the original game is $\delta$. We provide characterization results on
limiting dynamics for a given game in terms of $\phi$ and $\delta$.

4. Better Response and Best Response Dynamics 

In this section, we consider better and best response dynamics, and study convergence 
properties of these update rules in near-potential games. All of the update rules considered in 
this section are discrete-time update rules, i.e., players are allowed to update their strategies
at time instants $t \in \mathbb{Z}_+ = \{1, 2, \dots\}$.

Best response dynamics is an update rule where at each time instant a player chooses its 
best response to other players' current strategy profile. In better response dynamics, on the 
other hand, players choose strategies that improve their payoffs, but these strategies need 
not be their best responses. Formal descriptions of these update rules are given below. 

Definition 4.1 (Better and Best Response Dynamics). At each time instant $t \in \{1, 2, \dots\}$,
a single player is chosen at random for updating its strategy, using a probability distribution
with full support over the set of players. Let $m$ be the player chosen at some time $t$, and let
$r \in E$ denote the strategy profile that is used at time $t - 1$.

1. Better response dynamics is the update process where player $m$ does not modify its strategy
if $u^m(r) = \max_{q^m} u^m(q^m, r^{-m})$, and otherwise it updates its strategy to a strategy
in $\{q^m \mid u^m(q^m, r^{-m}) > u^m(r)\}$, chosen uniformly at random.

2. Best response dynamics is the update process where player $m$ does not modify its strategy
if $u^m(r) = \max_{q^m} u^m(q^m, r^{-m})$, and otherwise it updates its strategy to a strategy
in $\arg\max_{q^m} u^m(q^m, r^{-m})$, chosen uniformly at random.

For simplicity of the analysis, we assume here that players are chosen randomly to update
their strategy. However, this assumption is not crucial for our results, and can be relaxed.

We refer to strategies in $\arg\max_{q^m} u^m(q^m, r^{-m})$ as best responses of player $m$ to $r^{-m}$. We
denote the strategy profile used at time $t$ by $p_t$, and we define the trajectory of the dynamics
as the sequence of strategy profiles $\{p_t\}_{t=0}^{\infty}$. In our analysis, we assume that the trajectory
is initialized at a strategy profile $p_0 \in E$ at time 0 and it evolves according to one of the
update rules described above.
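The update rules of Definition 4.1 can be simulated directly. The sketch below is illustrative only: it assumes payoffs are stored as one dictionary per player keyed by pure strategy profiles, it draws the updating player uniformly (one particular full-support distribution), and the function names and the 2x2 game are hypothetical.

```python
import random

def deviate(p, m, q):
    """Profile obtained from p when player m switches to strategy q."""
    return p[:m] + (q,) + p[m + 1:]

def step(p, u, sizes, rng, rule="best"):
    """One update of Definition 4.1 starting from profile p."""
    m = rng.randrange(len(sizes))                  # uniform choice of the updating player
    payoffs = [u[m][deviate(p, m, q)] for q in range(sizes[m])]
    best = max(payoffs)
    if payoffs[p[m]] == best:                      # already a best response: no change
        return p
    if rule == "best":
        candidates = [q for q, v in enumerate(payoffs) if v == best]
    else:                                          # better response: any strict improvement
        candidates = [q for q, v in enumerate(payoffs) if v > payoffs[p[m]]]
    return deviate(p, m, rng.choice(candidates))

def trajectory(p0, u, sizes, steps, rule="best", seed=0):
    """Sequence p_0, p_1, ..., p_steps generated by the chosen update rule."""
    rng = random.Random(seed)
    path = [p0]
    for _ in range(steps):
        path.append(step(path[-1], u, sizes, rng, rule))
    return path

# Hypothetical 2x2 common-interest game (a potential game).
sizes = [2, 2]
phi = {(0, 0): 2.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 1.0}
u = [dict(phi), dict(phi)]

path = trajectory((0, 1), u, sizes, steps=50, rule="best")
print(path[:3], path[-1])
```

In this example every player can improve at the initial profile, so the trajectory moves to one of the two pure Nash equilibria after the first update and stays there.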

The following theorem establishes that in finite games, better and best response dynamics
converge to a set of $\epsilon$-equilibria, where the size of this set is characterized by the MPD to a
close potential game.

Theorem 4.1. Consider a game $\mathcal{G}$ and let $\hat{\mathcal{G}}$ be a nearby potential game such that $d(\mathcal{G}, \hat{\mathcal{G}}) \le \delta$.
Assume that best response or better response dynamics are used in $\mathcal{G}$, and denote the
number of strategy profiles in these games by $|E| = h$.

For both update processes, the trajectories are contained in the $\delta h$-equilibrium set of $\mathcal{G}$
after finite time with probability 1, i.e., let $T$ be a random variable such that $p_t \in \mathcal{X}_{\delta h}$ for
all $t > T$, then $P(T < \infty) = 1$.

Proof. We prove the claim by modeling the update process using a Markov chain, and
employing the improvement path condition for potential games (cf. Proposition 2.1).

Using Definition 4.1, we can represent the strategy updates in best response dynamics as
the state transitions in the following Markov chain: (i) each state corresponds to a strategy
profile, and (ii) there is a nonzero transition probability from state $r$ to state $q \ne r$ if $r$ and
$q$ differ in the strategy of a single player, say $m$, and $q^m$ is a (strict) best response of player
$m$ to $r^{-m}$. The probability of transition from state $r$ to state $q$ is equal to the probability
that at strategy profile $r$, player $m$ is chosen for update and it chooses $q^m$ as its new strategy.
In the case of better response dynamics we allow $q^m$ to be any strategy strictly improving the
payoff of player $m$, and a similar Markov chain representation still holds.

Since there are finitely many states, one of the recurrent classes of the Markov chain is
reached in finite time (with probability 1). Thus, to prove the claim, it is sufficient to show
that any state which belongs to some recurrent class of this Markov chain is contained in
the $\delta h$-equilibrium set of $\mathcal{G}$.



It follows from Definition 4.1 that a recurrent class is a singleton only if none of the
players can strictly improve its payoff by unilaterally deviating from the corresponding strategy
profile. Thus, such a strategy profile is a Nash equilibrium of $\mathcal{G}$ and is contained in the
$\delta h$-equilibrium set.

Consider a recurrent class that is not a singleton. Let $r$ be a strategy profile in this
recurrent class. Since the recurrent class is not a singleton, there exists some player $m$
who can unilaterally deviate from $r$ by following its best response to another strategy profile
$q$, and increase its payoff by some $\alpha > 0$. Since such a transition occurs with nonzero
probability, $r$ and $q$ are in the same recurrent class, and the process when started from $r$
visits $q$ and returns to $r$ in finitely many updates. Since each transition corresponds to a
unilateral deviation that strictly improves the payoff of the deviating player, this constitutes
a simple closed improvement path containing $r$ and $q$. Let $\gamma = (p_0, \dots, p_N)$ be such an
improvement path with $p_0 = p_N = r$, $p_1 = q$ and $N \le |E| = h$, and let $m_i$ denote the player
who deviates at step $i$. Since $u^m(q) - u^m(r) = \alpha$,
and $u^{m_i}(p_i) - u^{m_i}(p_{i-1}) > 0$ at every step $i$ of the path, this closed improvement path
satisfies

\[
\sum_{i=1}^{N} \left( u^{m_i}(p_i) - u^{m_i}(p_{i-1}) \right) \ge \alpha. \tag{2}
\]



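On the other hand it follows by Proposition 2.1 that the close potential game satisfies

\[
\sum_{i=1}^{N} \left( \hat{u}^{m_i}(p_i) - \hat{u}^{m_i}(p_{i-1}) \right) = 0. \tag{3}
\]

Combining (2) and (3) we conclude that

\[
\alpha \le \sum_{i=1}^{N} \left[ \left( u^{m_i}(p_i) - u^{m_i}(p_{i-1}) \right) - \left( \hat{u}^{m_i}(p_i) - \hat{u}^{m_i}(p_{i-1}) \right) \right] \le N \delta.
\]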



Since $N \le |E| = h$, it follows that $\alpha \le \delta h$. The claim then immediately follows since
$r$ and the recurrent class were chosen arbitrarily, and our analysis shows that the payoff
improvement of player $m$ (chosen for strategy update using a probability distribution with
full support as described in Definition 4.1), due to its best response is bounded by $\delta h$. □



As can be seen from the proof of this theorem, extending dynamical properties of potential 
games to nearby games relies on special structural properties of potential games. As a 
corollary of the above theorem, we obtain that trajectories generated by better and best 
response dynamics converge to a Nash equilibrium in potential games, since if $\mathcal{G}$ is a potential
game, the close potential game $\hat{\mathcal{G}}$ can be chosen such that $d(\mathcal{G}, \hat{\mathcal{G}}) = 0$.
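The key step of the proof, namely that the deviating players' payoff changes sum to zero around a closed path in the potential game and to at most $N\delta$ in the nearby game, can be illustrated on a small example. This is a hedged sketch: the dictionary-based game representation, the function name `path_gain`, and the 2x2 payoffs are assumptions made for illustration, and the chosen cycle is simply a closed path of unilateral deviations rather than an improvement path.

```python
def path_gain(path, u):
    """Sum over steps of the deviating player's payoff change along a path."""
    total = 0.0
    for prev, cur in zip(path, path[1:]):
        movers = [m for m in range(len(cur)) if prev[m] != cur[m]]
        assert len(movers) == 1, "each step must be a unilateral deviation"
        m = movers[0]
        total += u[m][cur] - u[m][prev]
    return total

# Hypothetical potential game (common payoffs) and a perturbed copy with MPD 0.1.
phi = {(0, 0): 2.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 1.0}
u_hat = [dict(phi), dict(phi)]
u = [dict(phi), dict(phi)]
u[0][(0, 0)] += 0.1
delta = 0.1

# Closed path of length N = 4 visiting every profile of the 2x2 game.
cycle = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
N = len(cycle) - 1

print(path_gain(cycle, u_hat))   # sums to zero in the potential game
print(path_gain(cycle, u))       # stays within N * delta in the nearby game
```

The same function can be applied to any closed improvement path found in a simulation to compare the accumulated improvement with the bound $N\delta$.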

5. Logit Response Dynamics 

In this section we focus on logit response dynamics. Logit response dynamics can be 
viewed as a smoothened version of the best response dynamics, in which a smoothing pa- 
rameter determines the frequency with which the best response strategy is picked. The 
evolution of the pure strategy profiles can be represented in terms of a Markov chain (with 
state space given by the set of pure strategy profiles). We characterize the stationary dis- 
tribution and stochastically stable states of this Markov chain (or of the update rule) in 
near-potential games. Our approach involves identifying a close potential game to a given 
game, and exploiting features of the corresponding potential function to characterize the 
limiting behavior of logit response dynamics in the original game. 



In Section 5.1, we provide a formal definition of logit response dynamics and review some
of its properties. We also present some of the mathematical tools used in the literature to
study this update rule. In Section 5.2, we show that the stationary distribution of logit
response dynamics in a near-potential game can be approximately characterized using the
potential function of a nearby potential game. We also use this result to show that the
stochastically stable strategy profiles are contained in approximate equilibrium sets in
near-potential games.

5.1. Properties of Logit Response 

We start by providing a formal definition of logit response dynamics: 

Definition 5.1. At each time instant $t \in \{1, 2, \dots\}$, a single player is chosen at random for
updating its strategy, using a probability distribution with full support over the set of players.
Let $m$ be the player chosen at some time $t$, and let $r \in E$ denote the strategy profile that is
used at time $t - 1$.

Logit response dynamics with parameter $\tau$ is the update process, where player $m$ chooses
a strategy $q^m \in E^m$ with probability

\[
P_\tau^m(q^m \mid r) = \frac{e^{\frac{1}{\tau} u^m(q^m, r^{-m})}}{\sum_{p^m \in E^m} e^{\frac{1}{\tau} u^m(p^m, r^{-m})}}.
\]

In this definition, $\tau > 0$ is a fixed parameter that determines how often players choose
their best responses. The probability of not choosing a best response decreases as $\tau$ decreases,
and as $\tau \to 0$, players choose their best responses with probability 1. This feature suggests
that logit response dynamics can be viewed as a generalization of best response dynamics,
where with small but nonzero probability players use a strategy that is not a best response.

For a given $\tau > 0$, this update process can be represented by a finite aperiodic and
irreducible Markov chain (Alos-Ferrer and Netzer 2010, Marden and Shamma 2008). The
states of the Markov chain correspond to the strategy profiles in the game. Denoting the
probability that player $m$ is chosen for a strategy update by $\alpha_m$, the transition probability from
strategy profile $p$ to $q$ can be given by (assuming $p \ne q$, and denoting the transition from
$p$ to $q$ by $p \to q$):

\[
P_\tau(p \to q) =
\begin{cases}
\alpha_m P_\tau^m(q^m \mid p) & \text{if } q^{-m} = p^{-m} \text{ for some } m \in \mathcal{M}, \\
0 & \text{otherwise.}
\end{cases}
\tag{4}
\]

The chain is aperiodic and irreducible since a player updating its strategy can choose any
strategy (including the current one) with positive probability. Consequently, it has a unique
stationary distribution.
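The Markov chain described above can be assembled explicitly for a small game. The following is a minimal sketch under stated assumptions: payoffs are stored as one dictionary per player keyed by pure strategy profiles, the player selection probabilities default to uniform, and the helper names and the 2x2 game are hypothetical. Subtracting the largest payoff before exponentiating does not change the probabilities; it only avoids overflow for small $\tau$.

```python
import math
from itertools import product

def deviate(p, m, q):
    """Profile obtained from p when player m switches to strategy q."""
    return p[:m] + (q,) + p[m + 1:]

def logit_probs(u_m, p, m, n, tau):
    """Definition 5.1: logit choice probabilities of player m at profile p."""
    base = max(u_m[deviate(p, m, q)] for q in range(n))   # shift for numerical stability
    w = [math.exp((u_m[deviate(p, m, q)] - base) / tau) for q in range(n)]
    z = sum(w)
    return [x / z for x in w]

def transition_matrix(u, sizes, tau, alpha=None):
    """Transition probabilities of (4); alpha[m] is the chance that player m updates."""
    states = list(product(*[range(n) for n in sizes]))
    alpha = alpha or [1.0 / len(sizes)] * len(sizes)
    P = {p: {q: 0.0 for q in states} for p in states}
    for p in states:
        for m, n in enumerate(sizes):
            for q, pr in enumerate(logit_probs(u[m], p, m, n, tau)):
                # When q equals p[m], the mass is added to the self-loop at p.
                P[p][deviate(p, m, q)] += alpha[m] * pr
    return states, P

# Hypothetical 2x2 common-interest game.
sizes = [2, 2]
phi = {(0, 0): 2.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 1.0}
u = [dict(phi), dict(phi)]

states, P = transition_matrix(u, sizes, tau=1.0)
for p in states:
    print(p, P[p])
```

Each row sums to one, and entries between profiles that differ in the strategies of two players are zero, as required by (4).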

We denote the stationary distribution of this Markov chain by $\mu_\tau$ and refer to it as
the stationary distribution of the logit response dynamics. A strategy profile $q$ such that
$\lim_{\tau \to 0} \mu_\tau(q) > 0$ is referred to as a stochastically stable strategy profile of the logit response
dynamics. Intuitively, these strategy profiles are the ones that are used with nonzero probability,
as players adopt their best responses more and more frequently in their strategy
updates.

In potential games, the stationary distribution of the logit response dynamics can be
written as an explicit function of the potential. If $\mathcal{G}$ is a potential game with potential function
$\phi$, the stationary distribution of the logit response dynamics is given by the distribution
(Alos-Ferrer and Netzer 2010, Blume 1997, Marden and Shamma 2008):*

\[
\mu_\tau(q) = \frac{e^{\frac{1}{\tau} \phi(q)}}{\sum_{p \in E} e^{\frac{1}{\tau} \phi(p)}}. \tag{5}
\]

It can be seen from (5) that $\lim_{\tau \to 0} \mu_\tau(q) > 0$ if and only if $q \in \arg\max_{p \in E} \phi(p)$. Thus,
in potential games the stochastically stable strategy profiles are those that maximize the
potential function.
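The closed form (5) can be verified numerically on a small potential game by checking that it is left unchanged by one step of logit response. This is an illustrative sketch only; the dictionary-based representation, the uniform choice of the updating player, the function names, and the 2x2 game are assumptions.

```python
import math

def deviate(p, m, q):
    """Profile obtained from p when player m switches to strategy q."""
    return p[:m] + (q,) + p[m + 1:]

def gibbs(phi, tau):
    """Distribution (5): mass proportional to exp(phi(q) / tau)."""
    top = max(phi.values())                       # shift for numerical stability
    w = {q: math.exp((v - top) / tau) for q, v in phi.items()}
    z = sum(w.values())
    return {q: x / z for q, x in w.items()}

def logit_step(mu, u, sizes, tau):
    """Push a distribution over profiles through one logit response update."""
    out = {q: 0.0 for q in mu}
    for p, mass in mu.items():
        for m, n in enumerate(sizes):
            w = [math.exp(u[m][deviate(p, m, q)] / tau) for q in range(n)]
            z = sum(w)
            for q in range(n):
                # Each player is chosen with probability 1 / (number of players).
                out[deviate(p, m, q)] += mass * (w[q] / z) / len(sizes)
    return out

# Hypothetical 2x2 common-interest game, so phi is a potential function.
sizes = [2, 2]
phi = {(0, 0): 2.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 1.0}
u = [dict(phi), dict(phi)]

mu = gibbs(phi, tau=0.5)
nxt = logit_step(mu, u, sizes, tau=0.5)
print(max(abs(nxt[q] - mu[q]) for q in mu))       # close to zero: mu is stationary
print(gibbs(phi, tau=0.05))                       # mass concentrates on the maximizer
```

Lowering `tau` in the last line shows the concentration of the stationary distribution on the maximizer of the potential, which is the stochastic stability statement in the text.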

We next describe a method for obtaining the stationary distribution of Markov chains. 
This method will be used in the next subsection in characterizing the stationary distribution 
of logit response. Assume that an irreducible Markov chain over a finite set of states $S$, with
transition probability matrix $P$ is given. Consider a directed tree, $T$, with nodes given by
the states of the Markov chain, and assume that an edge from node $q$ to node $p$ can exist
only if there is a nonzero transition probability from $q$ to $p$ in the Markov chain. We say
that the tree is rooted at state $p$, if from every state $q \ne p$ there exists a unique directed
path along the tree to $p$. For each state $p \in S$, denote by $\mathcal{T}(p)$ the set of all trees rooted at
$p$, and define a weight $w_p > 0$ such that

\[
w_p = \sum_{T \in \mathcal{T}(p)} \ \prod_{(q \to r) \in T} P(q \to r). \tag{6}
\]

* Note that expression (5) is independent of $\{\alpha_m\}$, i.e., the probability distribution that is used to choose
which player updates its strategy has no effect on the stationary distribution of logit response.

The following proposition from the Markov chain literature (Anantharam and Tsoucas
(1989), Freidlin and Wentzell (1998), Leighton and Rivest (1983)), known as the Markov
chain tree theorem, expresses the stationary distribution of Markov chains in terms of these
weights.

Proposition 5.1. The stationary distribution of the Markov chain defined over set $S$ is
given by $\mu(p) = \dfrac{w_p}{\sum_{q \in S} w_q}$.

For any $T \in \mathcal{T}(p)$, intuitively, the quantity $\prod_{(q \to r) \in T} P(q \to r)$ gives a measure of
likelihood of the event that node $p$ is reached when the chain is initiated from the leaves
(i.e., nodes with indegree equal to 0) of $T$. Thus, $w_p$ captures how likely it is that node
$p$ is visited in this chain, and the normalization in Proposition 5.1 gives the stationary
distribution. Since for finite games logit response dynamics can be modeled as an irreducible
Markov chain, this result can be used to characterize its stationary distribution.
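For a very small chain the tree weights in (6) can be enumerated by brute force, which gives a direct check of Proposition 5.1. The sketch below is illustrative: the enumeration strategy, the function names, and the 3-state transition matrix are assumptions, and the approach is exponential in the number of states, so it is not meant for chains of realistic size.

```python
from itertools import product

def tree_weight(P, root):
    """Weight w_root in (6): sum over trees rooted at root of edge-probability products."""
    n = len(P)
    others = [s for s in range(n) if s != root]
    total = 0.0
    # Every non-root state picks one outgoing edge; keep choices that form a tree.
    for targets in product(range(n), repeat=len(others)):
        nxt = dict(zip(others, targets))
        if any(s == t or P[s][t] == 0.0 for s, t in nxt.items()):
            continue
        is_tree = True
        for start in others:
            cur, seen = start, set()
            while cur != root and cur not in seen:
                seen.add(cur)
                cur = nxt[cur]
            if cur != root:                        # the walk entered a cycle
                is_tree = False
                break
        if is_tree:
            weight = 1.0
            for s, t in nxt.items():
                weight *= P[s][t]
            total += weight
    return total

def stationary_by_trees(P):
    """Proposition 5.1: normalize the tree weights."""
    w = [tree_weight(P, r) for r in range(len(P))]
    z = sum(w)
    return [x / z for x in w]

# Hypothetical irreducible 3-state chain (rows sum to one).
P = [[0.5, 0.5, 0.0],
     [0.2, 0.5, 0.3],
     [0.1, 0.0, 0.9]]

mu = stationary_by_trees(P)
print(mu)
# Stationarity check: mu P should reproduce mu.
print([sum(mu[i] * P[i][j] for i in range(3)) for j in range(3)])
```

Note that self-loops never appear in a rooted tree, so the diagonal entries of `P` do not enter the weights.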

5.2. Stationary Distribution of Logit Response Dynamics 

In this section we show that the stationary distribution of logit response dynamics in near- 
potential games can be approximated by exploiting the potential function of a close potential 
game. We start by showing that in games with small MPD logit response dynamics have 
similar transition probabilities. 

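Lemma 5.1. Consider a game $\mathcal{G}$ and let $\hat{\mathcal{G}}$ be a nearby potential game such that $d(\mathcal{G}, \hat{\mathcal{G}}) \le \delta$.
Denote the transition probability matrices of logit response dynamics in $\mathcal{G}$ and $\hat{\mathcal{G}}$ by $P_\tau$ and
$\hat{P}_\tau$ respectively. For all strategy profiles $p$ and $q$ that differ in the strategy of at most one
player, we have

\[
e^{-\frac{2\delta}{\tau}} \le \frac{\hat{P}_\tau(p \to q)}{P_\tau(p \to q)} \le e^{\frac{2\delta}{\tau}}.
\]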



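Proof. Assume that $p^{-m} = q^{-m}$. In $\mathcal{G}$ the transition probability $P_\tau(p \to q)$ can be expressed
by (see (4)):

\[
P_\tau(p \to q) =
\begin{cases}
\alpha_m P_\tau^m(q^m \mid p) & \text{if } q^m \ne p^m, \\
\sum_{k} \alpha_k P_\tau^k(p^k \mid p) & \text{otherwise.}
\end{cases}
\]

A similar expression holds for the transition probability $\hat{P}_\tau(p \to q)$ in $\hat{\mathcal{G}}$, replacing $P_\tau^m$ by
$\hat{P}_\tau^m$. Thus, it is sufficient to prove $e^{-\frac{2\delta}{\tau}} \le \hat{P}_\tau^m(q^m \mid p) / P_\tau^m(q^m \mid p) \le e^{\frac{2\delta}{\tau}}$ for all $p$, $m$, $q^m$ to prove
the claim.

Observe that by the definition of MPD

\[
\hat{u}^m(r^m, p^{-m}) - \hat{u}^m(p^m, p^{-m}) - \delta \le u^m(r^m, p^{-m}) - u^m(p^m, p^{-m}) \le \hat{u}^m(r^m, p^{-m}) - \hat{u}^m(p^m, p^{-m}) + \delta. \tag{7}
\]

Definition 5.1 suggests that $P_\tau^m(q^m \mid p)$ can be written as (by dividing the numerator and the
denominator by $e^{\frac{1}{\tau} u^m(p^m, p^{-m})}$):

\[
P_\tau^m(q^m \mid p) = \frac{e^{\frac{1}{\tau} \left( u^m(q^m, p^{-m}) - u^m(p^m, p^{-m}) \right)}}{\sum_{r^m \in E^m} e^{\frac{1}{\tau} \left( u^m(r^m, p^{-m}) - u^m(p^m, p^{-m}) \right)}}.
\]

Therefore, using the bounds in (7) it follows that

\[
P_\tau^m(q^m \mid p) \le \frac{K(q^m) e^{\frac{\delta}{\tau}}}{K(q^m) e^{\frac{\delta}{\tau}} + \sum_{r^m \ne q^m} K(r^m) e^{-\frac{\delta}{\tau}}},
\]

where $K(r^m) = e^{\frac{1}{\tau} \left( \hat{u}^m(r^m, p^{-m}) - \hat{u}^m(p^m, p^{-m}) \right)}$ for all $r^m \in E^m$. Dividing both the numerator and
the denominator of the right hand side by $\sum_{r^m \in E^m} K(r^m)$ and observing that $\hat{P}_\tau^m(q^m \mid p) = K(q^m) / \sum_{r^m \in E^m} K(r^m)$, we obtain

\[
P_\tau^m(q^m \mid p) \le \frac{e^{\frac{\delta}{\tau}} \hat{P}_\tau^m(q^m \mid p)}{e^{\frac{\delta}{\tau}} \hat{P}_\tau^m(q^m \mid p) + e^{-\frac{\delta}{\tau}} \left( 1 - \hat{P}_\tau^m(q^m \mid p) \right)},
\]

or equivalently

\[
\frac{P_\tau^m(q^m \mid p)}{\hat{P}_\tau^m(q^m \mid p)} \le \frac{e^{\frac{\delta}{\tau}}}{e^{\frac{\delta}{\tau}} \hat{P}_\tau^m(q^m \mid p) + e^{-\frac{\delta}{\tau}} \left( 1 - \hat{P}_\tau^m(q^m \mid p) \right)}.
\]

It can be seen that the right hand side is decreasing in $\hat{P}_\tau^m(q^m \mid p)$. Thus replacing $\hat{P}_\tau^m(q^m \mid p)$
by 0, the right hand side can be upper bounded by $e^{\frac{2\delta}{\tau}}$. Then we obtain $P_\tau^m(q^m \mid p) / \hat{P}_\tau^m(q^m \mid p) \le e^{\frac{2\delta}{\tau}}$.
By symmetry we also conclude that $\hat{P}_\tau^m(q^m \mid p) / P_\tau^m(q^m \mid p) \le e^{\frac{2\delta}{\tau}}$, and combining these
bounds the claim follows. □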



Definition 5.1 suggests that perturbation of utility functions changes the transition probabilities
multiplicatively in logit response. The above lemma supports this intuition: if utility
gains due to unilateral deviations are modified by $\delta$, the ratio of the transition probabilities
can change at most by $e^{\frac{2\delta}{\tau}}$. Thus, if two games are close, then the transition probabilities of
logit response in these games should be closely related.
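The multiplicative bound of Lemma 5.1 can be checked numerically for a pair of nearby games. The following is a sketch under stated assumptions: the games are hypothetical 2x2 examples stored as dictionaries, the perturbation is chosen so that the MPD is about 0.1, and the helper names are illustrative.

```python
import math
from itertools import product

def deviate(p, m, q):
    """Profile obtained from p when player m switches to strategy q."""
    return p[:m] + (q,) + p[m + 1:]

def logit_probs(u_m, p, m, n, tau):
    """Logit choice probabilities of player m at profile p (Definition 5.1)."""
    base = max(u_m[deviate(p, m, q)] for q in range(n))
    w = [math.exp((u_m[deviate(p, m, q)] - base) / tau) for q in range(n)]
    z = sum(w)
    return [x / z for x in w]

# Hypothetical potential game and a perturbed copy whose MPD is about 0.1.
sizes = [2, 2]
phi = {(0, 0): 2.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 1.0}
u_hat = [dict(phi), dict(phi)]
u = [dict(phi), dict(phi)]
u[0][(0, 0)] += 0.1

delta, tau = 0.1, 0.2
bound = math.exp(2 * delta / tau)

ratios = []
for p in product(*[range(n) for n in sizes]):
    for m, n in enumerate(sizes):
        original = logit_probs(u[m], p, m, n, tau)
        nearby = logit_probs(u_hat[m], p, m, n, tau)
        ratios += [b / a for a, b in zip(original, nearby)]

print(min(ratios), max(ratios), bound)
```

Since the player selection probabilities $\alpha_m$ are the same in both chains, bounding the ratios of the choice probabilities is enough to bound the ratios of the transition probabilities.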

This suggests using results from perturbation theory of Markov chains to characterize the
stationary distribution of logit response in a near-potential game (Cho and Meyer 2001, Haviv
and Van der Heyden 1984). However, standard perturbation results characterize changes in
the stationary distribution of a Markov chain when the transition probabilities are additively
perturbed. These results, when applied to multiplicative perturbations, yield bounds which
are uninformative. We therefore first present a result which characterizes deviations from the
stationary distribution of a Markov chain when its transition probabilities are multiplicatively
perturbed, and which may therefore be of independent interest.*

Theorem 5.1. Let $P$ and $\hat{P}$ denote the probability transition matrices of two finite irreducible
Markov chains with the same state space. Denote the stationary distributions of these Markov
chains by $\mu$ and $\hat{\mu}$ respectively, and let the cardinality of the state space be $h$. Assume that
$\alpha \ge 1$ is a given constant and for any two states $p$ and $q$, the following inequalities hold:

\[
\alpha^{-1} P(p \to q) \le \hat{P}(p \to q) \le \alpha P(p \to q).
\]

Then, for any state $p$, we have

(i)
\[
\frac{\alpha^{-(h-1)} \mu(p)}{\alpha^{-(h-1)} \mu(p) + \alpha^{h-1} (1 - \mu(p))} \le \hat{\mu}(p) \le \frac{\alpha^{h-1} \mu(p)}{\alpha^{h-1} \mu(p) + \alpha^{-(h-1)} (1 - \mu(p))},
\]

(ii)
\[
|\mu(p) - \hat{\mu}(p)| \le \frac{\alpha^{h-1} - 1}{\alpha^{h-1} + 1}.
\]

* A multiplicative perturbation bound similar to ours can be found in Freidlin and Wentzell (1998).
However, this bound is looser than the one we obtain and it does not provide a good characterization of the
stationary distribution in our setting. We provide a tighter bound, and obtain stronger predictions on the
stationary distribution of logit response.


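Proof. As before, let $\mathcal{T}(p)$ denote the set of directed trees that are rooted at state $p$. Using
the characterization of the stationary distribution in Proposition 5.1, for the Markov chain
with probability transition matrix $P$, we have $\mu(p) = \dfrac{w_p}{\sum_q w_q}$, where for each state $p$,

\[
w_p = \sum_{T \in \mathcal{T}(p)} \ \prod_{(x \to y) \in T} P(x \to y).
\]

For the Markov chain with probability transition matrix $\hat{P}$, we define $\hat{w}_p$ by replacing $P$ in
the above equation with $\hat{P}$, and $\hat{\mu}(p)$ similarly satisfies $\hat{\mu}(p) = \dfrac{\hat{w}_p}{\sum_q \hat{w}_q}$.

Since the Markov chain has $h$ states, $|T| = h - 1$ for all $T \in \mathcal{T}(p)$, i.e., every rooted tree has $h - 1$ edges. Hence, it follows from
the assumption of the theorem and the above definitions of $w_p$ and $\hat{w}_p$ that

\[
\alpha^{-(h-1)} w_p = \alpha^{-(h-1)} \sum_{T \in \mathcal{T}(p)} \prod_{(x \to y) \in T} P(x \to y)
\le \sum_{T \in \mathcal{T}(p)} \prod_{(x \to y) \in T} \hat{P}(x \to y) = \hat{w}_p
\le \alpha^{h-1} \sum_{T \in \mathcal{T}(p)} \prod_{(x \to y) \in T} P(x \to y) = \alpha^{h-1} w_p.
\]

This inequality implies that for all $q$, $\hat{w}_q$ is upper bounded by $\alpha^{h-1} w_q$ and lower bounded
by $\alpha^{-(h-1)} w_q$. Using this observation together with the identity $\hat{\mu}(p) = \dfrac{\hat{w}_p}{\sum_q \hat{w}_q}$, we obtain

\[
\frac{\alpha^{-(h-1)} w_p}{\alpha^{-(h-1)} w_p + \alpha^{h-1} \sum_{q \ne p} w_q} \le \hat{\mu}(p) = \frac{\hat{w}_p}{\sum_q \hat{w}_q} \le \frac{\alpha^{h-1} w_p}{\alpha^{h-1} w_p + \alpha^{-(h-1)} \sum_{q \ne p} w_q}.
\]

Dividing the numerators and denominators of the left and right hand sides of the inequality
by $\sum_q w_q$, using Proposition 5.1, and observing that $\sum_{q \ne p} \mu(q) = 1 - \mu(p)$, the first part of
the theorem follows.

Consider functions $f$ and $g$ defined on $[0, 1]$ such that

\[
f(x) = \frac{\alpha^{h-1} x}{\alpha^{h-1} x + \alpha^{-(h-1)} (1 - x)} - x
\quad \text{and} \quad
g(x) = \frac{\alpha^{-(h-1)} x}{\alpha^{-(h-1)} x + \alpha^{h-1} (1 - x)} - x
\]

for $x \in [0, 1]$. Checking the first order optimality conditions,
it can be seen that $f(x)$ is maximized at $x = \dfrac{1}{1 + \alpha^{h-1}}$, and the maximum equals $\dfrac{\alpha^{h-1} - 1}{\alpha^{h-1} + 1}$.
Similarly, the minimum of $g(x)$ is achieved at $x = \dfrac{\alpha^{h-1}}{1 + \alpha^{h-1}}$ and is equal to $\dfrac{1 - \alpha^{h-1}}{1 + \alpha^{h-1}}$. Combining
these observations with part (i), we obtain

\[
\frac{1 - \alpha^{h-1}}{1 + \alpha^{h-1}}
\le \frac{\alpha^{-(h-1)} \mu(p)}{\alpha^{-(h-1)} \mu(p) + \alpha^{h-1} (1 - \mu(p))} - \mu(p)
\le \hat{\mu}(p) - \mu(p)
\le \frac{\alpha^{h-1} \mu(p)}{\alpha^{h-1} \mu(p) + \alpha^{-(h-1)} (1 - \mu(p))} - \mu(p)
\le \frac{\alpha^{h-1} - 1}{\alpha^{h-1} + 1},
\]

hence the second part of the claim follows. □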
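The bounds of Theorem 5.1 can be sanity-checked on a pair of small chains. The sketch below is illustrative and rests on assumptions: the two 3-state transition matrices are hypothetical, the stationary distributions are approximated by power iteration (adequate here because both chains have self-loops and mix quickly), and the function names are not from the paper.

```python
def stationary(P, iters=5000):
    """Approximate stationary distribution by power iteration."""
    n = len(P)
    mu = [1.0 / n] * n
    for _ in range(iters):
        mu = [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]
    return mu

def perturbation_level(P, P_hat):
    """Smallest alpha >= 1 with P / alpha <= P_hat <= alpha * P entrywise."""
    alpha = 1.0
    for row, row_hat in zip(P, P_hat):
        for a, b in zip(row, row_hat):
            if a == 0.0 and b == 0.0:
                continue
            if a == 0.0 or b == 0.0:
                raise ValueError("the two chains must have the same zero pattern")
            alpha = max(alpha, a / b, b / a)
    return alpha

def theorem_bounds(mu_p, alpha, h):
    """Lower and upper bounds of part (i) and the gap of part (ii)."""
    beta = alpha ** (h - 1)
    lower = (mu_p / beta) / (mu_p / beta + beta * (1.0 - mu_p))
    upper = (beta * mu_p) / (beta * mu_p + (1.0 - mu_p) / beta)
    gap = (beta - 1.0) / (beta + 1.0)
    return lower, upper, gap

# Hypothetical chain and a multiplicatively perturbed copy (rows sum to one).
P = [[0.5, 0.5, 0.0],
     [0.2, 0.5, 0.3],
     [0.1, 0.0, 0.9]]
P_hat = [[0.55, 0.45, 0.0],
         [0.22, 0.51, 0.27],
         [0.09, 0.0, 0.91]]

alpha = perturbation_level(P, P_hat)
mu, mu_hat = stationary(P), stationary(P_hat)
for p in range(3):
    print(p, theorem_bounds(mu[p], alpha, 3), mu_hat[p])
```

In this example the actual change in the stationary distribution is much smaller than the worst-case gap in part (ii), which is expected since the bound has to hold for every chain with the same perturbation level.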



Next we use the above theorem to relate the stationary distributions of logit response 
dynamics in nearby games. 

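Corollary 5.1. Let $\mathcal{G}$ and $\hat{\mathcal{G}}$ be finite games with number of strategy profiles $|E| = h$, such
that $d(\mathcal{G}, \hat{\mathcal{G}}) \le \delta$. Denote the stationary distributions of logit response dynamics in these
games by $\mu_\tau$ and $\hat{\mu}_\tau$ respectively. Then, for any strategy profile $p$ we have

(i)
\[
\frac{e^{-\frac{2\delta(h-1)}{\tau}} \mu_\tau(p)}{e^{-\frac{2\delta(h-1)}{\tau}} \mu_\tau(p) + e^{\frac{2\delta(h-1)}{\tau}} (1 - \mu_\tau(p))}
\le \hat{\mu}_\tau(p) \le
\frac{e^{\frac{2\delta(h-1)}{\tau}} \mu_\tau(p)}{e^{\frac{2\delta(h-1)}{\tau}} \mu_\tau(p) + e^{-\frac{2\delta(h-1)}{\tau}} (1 - \mu_\tau(p))},
\]

(ii)
\[
|\mu_\tau(p) - \hat{\mu}_\tau(p)| \le \frac{e^{\frac{2\delta(h-1)}{\tau}} - 1}{e^{\frac{2\delta(h-1)}{\tau}} + 1}.
\]

Proof. Proof follows from Lemma 5.1 and Theorem 5.1 by setting $\alpha = e^{\frac{2\delta}{\tau}}$. □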



The above corollary can be adapted to near-potential games, by exploiting the relation of
stationary distribution of logit response and potential function in potential games (see (5)).
We conclude this section by providing such a characterization of the stationary distribution 
of logit response dynamics in near-potential games. 

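Corollary 5.2. Consider a game $\mathcal{G}$ and let $\hat{\mathcal{G}}$ be a nearby potential game such that $d(\mathcal{G}, \hat{\mathcal{G}}) \le \delta$.
Denote the potential function of $\hat{\mathcal{G}}$ by $\phi$, and the number of strategy profiles in these games
by $|E| = h$. Then, the stationary distribution $\mu_\tau$ of logit response dynamics in $\mathcal{G}$ is such that

(i)
\[
\frac{e^{\frac{1}{\tau} (\phi(p) - 2\delta(h-1))}}{e^{\frac{1}{\tau} (\phi(p) - 2\delta(h-1))} + \sum_{q \ne p,\ q \in E} e^{\frac{1}{\tau} (\phi(q) + 2\delta(h-1))}}
\le \mu_\tau(p) \le
\frac{e^{\frac{1}{\tau} (\phi(p) + 2\delta(h-1))}}{e^{\frac{1}{\tau} (\phi(p) + 2\delta(h-1))} + \sum_{q \ne p,\ q \in E} e^{\frac{1}{\tau} (\phi(q) - 2\delta(h-1))}},
\]

(ii)
\[
\left| \mu_\tau(p) - \frac{e^{\frac{1}{\tau} \phi(p)}}{\sum_{q \in E} e^{\frac{1}{\tau} \phi(q)}} \right| \le \frac{e^{\frac{2\delta(h-1)}{\tau}} - 1}{e^{\frac{2\delta(h-1)}{\tau}} + 1}.
\]

Proof. Proof follows from Corollary 5.1 and (5). □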



With simple manipulations, it can be shown that $(e^x - 1)/(e^x + 1) \le x/2$ for $x \ge 0$ (the left hand side equals $\tanh(x/2)$, which is bounded by its argument). Thus,
with $x = 2\delta(h-1)/\tau$, (ii) in the above corollary implies that

\[
\left| \mu_\tau(p) - \frac{e^{\frac{1}{\tau} \phi(p)}}{\sum_{q \in E} e^{\frac{1}{\tau} \phi(q)}} \right| \le \frac{\delta(h-1)}{\tau}.
\]

Therefore, the stationary distribution of logit response dynamics in a near-potential game can be characterized in
terms of the stationary distribution of this update rule in a close potential game. When $\tau$
is fixed and $\delta \to 0$, i.e., when the original game is arbitrarily close to a potential game, the
stationary distribution of logit response is arbitrarily close to the stationary distribution in
the potential game. On the other hand, for a fixed $\delta$, as $\tau \to 0$, the upper bound in (ii)
becomes uninformative. This is the case since $\tau \to 0$ implies that players adopt their best
responses with probability 1, and thus the stationary distribution of the update rule becomes
very sensitive to the difference of the game from a potential game. In this case we can still
characterize the stochastically stable states of logit response using the results of Corollary
5.2, as we show in Corollary 5.3.

Corollary 5.3. Consider a game $\mathcal{G}$ and let $\hat{\mathcal{G}}$ be a nearby potential game with potential
function $\phi$ and $d(\mathcal{G}, \hat{\mathcal{G}}) \le \delta$. Denote the number of
strategy profiles in these games by $|E| = h$. The stochastically stable strategy profiles of $\mathcal{G}$
are (i) contained in $S = \{p \mid \phi(p) \ge \max_q \phi(q) - 4\delta(h-1)\}$, (ii) $4\delta h$-equilibria of $\mathcal{G}$.

Proof. (i) The upper bound in the first part of Corollary 5.2 implies that if $p$ is a strategy
profile such that $\phi(p) < \max_{q \in E} \phi(q) - 4\delta(h-1)$, then the stationary distribution of logit
response in $\mathcal{G}$ is such that $\mu_\tau(p) \to 0$ as $\tau \to 0$. Indeed, keeping only a maximizer $q^*$ of $\phi$ in the
denominator of the upper bound gives $\mu_\tau(p) \le e^{\frac{1}{\tau} (\phi(p) - \phi(q^*) + 4\delta(h-1))}$, and the exponent is negative. Thus, it immediately follows that the
stochastically stable states in $\mathcal{G}$ are contained in $\{p \in E \mid \phi(p) \ge \max_{q \in E} \phi(q) - 4\delta(h-1)\}$.
(ii) From the definition of $S$ it follows that in $\hat{\mathcal{G}}$, none of the players can deviate from
a strategy profile in $S$ and improve its utility by more than $4\delta(h-1)$. Since $d(\mathcal{G}, \hat{\mathcal{G}}) \le \delta$,
it follows from part (i) that in $\mathcal{G}$, none of the players can unilaterally deviate from a
stochastically stable strategy profile and improve its utility by more than $4\delta(h-1) + \delta \le 4\delta h$.
Hence stochastically stable strategy profiles of $\mathcal{G}$ are $4\delta h$-equilibria. □

We conclude that in near-potential games, the stochastically stable states of logit re- 
sponse are the strategy profiles that approximately maximize the potential function of a 
close potential game. This result enables us to characterize the stochastically stable states 
of logit response dynamics in near-potential games, without explicitly computing the sta- 
tionary distribution. 
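The set $S$ of Corollary 5.3 is straightforward to compute once a close potential game is available. The following is a minimal sketch, assuming the potential is stored as a dictionary over pure strategy profiles; the function name and the numbers are hypothetical.

```python
def near_maximizers(phi, delta):
    """Set S of Corollary 5.3: profiles within 4 * delta * (h - 1) of the top potential."""
    h = len(phi)                                  # number of strategy profiles
    slack = 4.0 * delta * (h - 1)
    top = max(phi.values())
    return {p for p, v in phi.items() if v >= top - slack}

# Hypothetical potential of a close potential game on a 2x2 strategy space.
phi = {(0, 0): 2.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 1.0}

print(near_maximizers(phi, delta=0.0))
print(near_maximizers(phi, delta=0.05))
print(near_maximizers(phi, delta=0.1))
```

Because the slack scales with $h - 1$, the set $S$ is informative mainly when $\delta$ is small relative to the gaps between potential values, as in the second call above.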

6. Fictitious Play 

In this section, we investigate the convergence behavior of fictitious play in near-potential
games. Unlike better/best response dynamics and logit response, in fictitious play agents
maintain an empirical frequency distribution of other players' strategies and play a best
response against it. Thus, analyzing fictitious play dynamics requires the notion of mixed
strategies and some additional definitions that are provided in Section 6.1. In Section 6.2,
we show that in finite games the empirical frequencies of fictitious play converge to a set
which can be characterized in terms of the approximate equilibrium set of the game and
the level sets of the potential function of a close potential game. When the original game
is sufficiently close to a potential game, we strengthen this result and establish that the
empirical frequencies converge to a small neighborhood of mixed equilibria of the game,
and the size of this neighborhood is a function of the distance of the original game from a
potential game. As a special case, our result allows us to recover the result of Monderer and
Shapley (1996a), which states that in potential games the empirical frequencies of fictitious
play converge to the set of mixed Nash equilibria.

6.1. Mixed Strategies and Equilibria 

In this section, we introduce some additional notation and definitions, which will be
used in Section 6.2 when studying convergence properties of fictitious play in near-potential
games.

We start by introducing the concept of mixed strategies in games. For each player
$m \in \mathcal{M}$, we denote by $\Delta E^m$ the set of probability distributions on $E^m$. For $x^m \in \Delta E^m$,
$x^m(p^m)$ denotes the probability player $m$ assigns to strategy $p^m \in E^m$. We refer to the
distribution $x^m \in \Delta E^m$ as a mixed strategy of player $m \in \mathcal{M}$ and to the collection
$x = \{x^m\}_{m \in \mathcal{M}} \in \prod_m \Delta E^m$ as a mixed strategy profile. The mixed strategy profile of all players
but the $m$th one is denoted by $x^{-m}$. We use $\|\cdot\|$ to denote the standard 2-norm on $\prod_m \Delta E^m$,
i.e., for $x \in \prod_m \Delta E^m$ we have $\|x\|^2 = \sum_{m \in \mathcal{M}} \sum_{p^m \in E^m} (x^m(p^m))^2$.

By slight (but standard) abuse of notation, we use the same notation for the mixed
extension of utility function $u^m$ of player $m \in \mathcal{M}$, i.e.,

\[
u^m(x) = \sum_{p \in E} u^m(p) \prod_{k \in \mathcal{M}} x^k(p^k), \tag{8}
\]

for all $x \in \prod_m \Delta E^m$. In addition, if player $m$ uses some pure strategy $q^m$ and other players
use the mixed strategy profile $x^{-m}$, the payoff of player $m$ is denoted by

\[
u^m(q^m, x^{-m}) = \sum_{p^{-m} \in E^{-m}} u^m(q^m, p^{-m}) \prod_{k \in \mathcal{M},\ k \ne m} x^k(p^k).
\]

Similarly, we denote the mixed extension of the potential function by $\phi(x)$, and we use the
notation $\phi(q^m, x^{-m})$ to denote the potential when player $m$ uses some pure strategy $q^m$ and
other players use the mixed strategy profile $x^{-m}$.
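The mixed extension is an expectation over independently drawn pure strategies, which can be computed by enumeration in small games. The sketch below is illustrative: it assumes a payoff dictionary keyed by pure strategy profiles and mixed strategies given as lists of probabilities, and the function names and the matching pennies payoffs are hypothetical choices.

```python
from itertools import product

def mixed_payoff(u_m, x):
    """Mixed extension: expected payoff when every player k draws from x[k] independently."""
    total = 0.0
    for p in product(*[range(len(xk)) for xk in x]):
        prob = 1.0
        for k, s in enumerate(p):
            prob *= x[k][s]
        total += u_m[p] * prob
    return total

def pure_vs_mixed(u_m, m, q, x):
    """Payoff of player m for the pure strategy q against the others' mixed strategies."""
    y = list(x)
    y[m] = [1.0 if s == q else 0.0 for s in range(len(x[m]))]
    return mixed_payoff(u_m, y)

# Hypothetical example: player 0's payoffs in matching pennies.
u0 = {(0, 0): 1.0, (0, 1): -1.0, (1, 0): -1.0, (1, 1): 1.0}

uniform = [[0.5, 0.5], [0.5, 0.5]]
skewed = [[0.5, 0.5], [0.25, 0.75]]
print(mixed_payoff(u0, uniform))
print(pure_vs_mixed(u0, 0, 0, skewed))
```

Representing a pure strategy as a degenerate mixed strategy keeps the two formulas in the text consistent with a single implementation.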

A mixed strategy profile $x = \{x^m\}_{m \in \mathcal{M}} \in \prod_m \Delta E^m$ is a mixed $\epsilon$-equilibrium if for all
$m \in \mathcal{M}$ and $p^m \in E^m$,

\[
u^m(p^m, x^{-m}) - u^m(x^m, x^{-m}) \le \epsilon. \tag{9}
\]

Note that if the inequality holds for $\epsilon = 0$, then $x$ is referred to as a mixed Nash equilibrium
of the game. In the rest of the paper, we use the notation $\mathcal{X}_\epsilon$ to denote the set of mixed
$\epsilon$-equilibria.
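Condition (9) can be turned into a small routine that returns the smallest $\epsilon$ for which a given mixed profile is a mixed $\epsilon$-equilibrium. This is a hedged sketch: the representation of payoffs as dictionaries and of mixed strategies as probability lists, the function names, and the matching pennies example are assumptions made for illustration.

```python
from itertools import product

def mixed_payoff(u_m, x):
    """Expected payoff under independent mixed strategies x."""
    total = 0.0
    for p in product(*[range(len(xk)) for xk in x]):
        prob = 1.0
        for k, s in enumerate(p):
            prob *= x[k][s]
        total += u_m[p] * prob
    return total

def epsilon_gap(u, x):
    """Largest gain any player can obtain from a pure deviation, as in (9)."""
    gap = 0.0
    for m in range(len(x)):
        current = mixed_payoff(u[m], x)
        for q in range(len(x[m])):
            y = list(x)
            y[m] = [1.0 if s == q else 0.0 for s in range(len(x[m]))]
            gap = max(gap, mixed_payoff(u[m], y) - current)
    return gap

def is_epsilon_equilibrium(u, x, eps, tol=1e-12):
    """True if x satisfies (9) for every player and every pure deviation."""
    return epsilon_gap(u, x) <= eps + tol

# Hypothetical example: matching pennies (zero-sum, no pure Nash equilibrium).
u0 = {(0, 0): 1.0, (0, 1): -1.0, (1, 0): -1.0, (1, 1): 1.0}
u1 = {p: -v for p, v in u0.items()}
u = [u0, u1]

uniform = [[0.5, 0.5], [0.5, 0.5]]
pure = [[1.0, 0.0], [1.0, 0.0]]
print(epsilon_gap(u, uniform), epsilon_gap(u, pure))
```

Checking only pure deviations is sufficient because the payoff of a mixed deviation is a convex combination of the payoffs of pure deviations.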

Our characterization of the limiting mixed strategy set of fictitious play depends on the
number of players in the game. We use $M = |\mathcal{M}|$ as a short-hand notation for this number.

We conclude this section with two technical lemmas which summarize some properties 
of mixed equilibria and mixed extensions of potential and utility functions. Proofs of these 
lemmas can be found in the Appendix. 

The first lemma establishes the Lipschitz continuity of the mixed extensions of the payoff
functions and the potential function. It also shows a natural implication of continuity:
for any $\epsilon' > \epsilon$, a small enough neighborhood of the $\epsilon$-equilibrium set is contained in the
$\epsilon'$-equilibrium set.

Lemma 6.1. (i) Let $\nu : \prod_{m \in \mathcal{M}} E^m \to \mathbb{R}$ be a mapping from pure strategy profiles to
real numbers. Its mixed extension is Lipschitz continuous with a Lipschitz constant of
$M \sum_{p \in E} |\nu(p)|$ over the domain $\prod_{m \in \mathcal{M}} \Delta E^m$.

(ii) Let $\alpha \ge 0$ and $\gamma > 0$ be given. There exists a small enough $\theta > 0$ such that for any
$\|x - y\| < \theta$, if $x \in \mathcal{X}_\alpha$, then $y \in \mathcal{X}_{\alpha + \gamma}$.

Lipschitz continuity follows from the fact that mixed extensions are multilinear functions
(8), with bounded domains. The proof of the second part immediately follows from the
Lipschitz continuity of mixed extensions of payoff functions and the definition of approximate
equilibria (9). Note that the second part implies that for any $\epsilon' > 0$, there exists a small
enough neighborhood of equilibria that is contained in the $\epsilon'$-equilibrium set of the game.

We next study the continuity properties of the approximate equilibrium mapping. We
first provide the relevant definitions (see Berge (1963), Fudenberg and Tirole (1991)).

Definition 6.1 (Upper Semicontinuous Function). A function $g : X \to Y \subset \mathbb{R}$ is upper
semicontinuous at $x^*$, if, for each $\epsilon > 0$ there exists a neighborhood $U$ of $x^*$, such that $g(x) <
g(x^*) + \epsilon$ for all $x \in U$. We say $g$ is upper semicontinuous, if it is upper semicontinuous at
every point in its domain.

Alternatively, $g$ is upper semicontinuous if $\limsup_{x_n \to x^*} g(x_n) \le g(x^*)$ for every $x^*$ in its
domain.

Definition 6.2 (Upper Semicontinuous Correspondence). A correspondence $g : X \rightrightarrows Y$ is
upper semicontinuous at $x^*$, if for any open neighborhood $V$ of $g(x^*)$ there exists a neighborhood
$U$ of $x^*$, such that $g(x) \subset V$ for all $x \in U$. We say $g$ is upper semicontinuous, if it is
upper semicontinuous at every point in its domain and $g(x)$ is a compact set for each $x \in X$.
Alternatively, when $Y$ is compact, $g$ is upper semicontinuous if its graph is closed, i.e.,
the set $\{(x, y) \mid x \in X,\ y \in g(x)\}$ is closed.

We next establish upper semicontinuity of the approximate equilibrium mapping.*

Lemma 6.2. (i) Let $\nu : \prod_{m \in \mathcal{M}} \Delta E^m \to \mathbb{R}$ be an upper semicontinuous function. The
correspondence $g : \mathbb{R} \rightrightarrows \prod_{m \in \mathcal{M}} \Delta E^m$ such that $g(\alpha) = \{x \mid \nu(x) \ge -\alpha\}$ is upper
semicontinuous.

(ii) Let $g : \mathbb{R} \rightrightarrows \prod_{m \in \mathcal{M}} \Delta E^m$ be the correspondence such that $g(\alpha) = \mathcal{X}_\alpha$. This
correspondence is upper semicontinuous.

Upper semicontinuity of the approximate equilibrium mapping implies that for any given
neighborhood of the $\epsilon$-equilibrium set, there exists an $\epsilon' > \epsilon$ such that the $\epsilon'$-equilibrium set
is contained in this neighborhood. In particular, this implies that every neighborhood of
equilibria of the game contains an $\epsilon'$-equilibrium set for some $\epsilon' > 0$. Hence, if disjoint
neighborhoods of equilibria are chosen (assuming there are finitely many equilibria), this implies
that there exists some $\epsilon' > 0$, such that the $\epsilon'$-equilibrium set is contained in disjoint
neighborhoods of equilibria. In the next section, we use this observation to establish convergence
of fictitious play to small neighborhoods of equilibria of near-potential games.

6.2. Discrete- Time Fictitious Play 

Fictitious play is a classical update rule studied in the learning in games literature. In
this section, we consider the fictitious play dynamics, proposed in Brown (1951), and explain
how the limiting behavior of this dynamical process can be characterized in near-potential
games. In particular, we show that the empirical frequencies of fictitious play converge to a
set which can be characterized in terms of the $\epsilon$-equilibrium set of the game, and the level
sets of the potential function of a close potential game. We also establish that for games
sufficiently close to a potential game, the empirical frequencies of fictitious play converge
to a neighborhood of the (mixed) equilibrium set. Moreover, the size of this neighborhood
depends on the distance of the original game from a nearby potential game. This generalizes
the result of Monderer and Shapley (1996a), on convergence of empirical frequencies to mixed
Nash equilibria in potential games.

* Here we fix the game, and discuss upper semicontinuity with respect to the $\epsilon$ parameter characterizing
the $\epsilon$-equilibrium set. We note that this is different than the common results in the literature which discuss
upper semicontinuity of the equilibrium set with respect to changes in the utility functions of the underlying
game (see Fudenberg and Tirole (1991)).
In this paper, we only consider the discrete-time version of fictitious play, i.e., the update process starts at a given strategy profile at time $t = 0$, and players can update their strategies at discrete time instants $t \in \{1, 2, \dots\}$. Throughout this subsection we denote the strategy used by player $m$ at time instant $t$ by $p_t^m$, and we denote by $\mathbf{1}(p_t^m = p^m)$ the indicator function, which equals 1 if $p_t^m = p^m$, and 0 otherwise. A formal definition of discrete-time fictitious play dynamics is given next.

Definition 6.3 (Discrete-Time Fictitious Play). Let $\mu_T^m(q^m) = \frac{1}{T} \sum_{t=0}^{T-1} \mathbf{1}(p_t^m = q^m)$ denote the empirical frequency that player $m$ uses strategy $q^m$ from time instant $0$ to time instant $T - 1$, and let $\mu_T^{-m}$ denote the collection of empirical frequencies of all players but $m$. A game play, where at each time instant $t$, every player $m$ chooses a strategy $p_t^m$ such that

$$p_t^m \in \arg\max_{q^m \in E^m} u^m(q^m, \mu_t^{-m})$$

is referred to as discrete-time fictitious play. That is, fictitious play dynamics is the update process where each player chooses its best response to the empirical frequencies of the actions of other players.

We refer to $\mu_t^m$ as the distribution of empirical frequencies of player $m$'s strategies at time $t$. Note that $\mu_t^m$ can be thought of as a vector of length $|E^m|$, whose entries are indexed by the strategies of player $m$, i.e., $\mu_t^m(p^m)$ denotes the entry of the vector corresponding to the empirical frequency with which player $m$ uses strategy $p^m$. Similarly, we define the joint empirical frequency distribution of all players as $\mu_t = \{\mu_t^m\}_{m \in \mathcal{M}}$. Note that $\mu_t^m \in \Delta E^m$, i.e., empirical frequency distributions are mixed strategies, and similarly $\mu_t \in \prod_{m \in \mathcal{M}} \Delta E^m$.

Observe that the evolution of this empirical frequency distribution can be captured by the following equation:

$$\mu_{t+1} = \frac{t}{t+1}\,\mu_t + \frac{1}{t+1}\,I_t, \qquad (10)$$

where $I_t = \{I_t^m\}_{m \in \mathcal{M}}$, and $I_t^m$ is a vector which has the same size as $\mu_t^m$ and whose entry corresponding to strategy $p^m$ is given by $I_t^m(p^m) = \mathbf{1}(p_t^m = p^m)$. Rearranging the terms in (10), and observing that $I_t, \mu_t \in \prod_{m \in \mathcal{M}} \Delta E^m$ are vectors with entries in $[0, 1]$, we conclude

$$\|\mu_{t+1} - \mu_t\| = \frac{1}{t+1}\,\|I_t - \mu_t\| = O\!\left(\frac{1}{t}\right), \qquad (11)$$

where $O(\cdot)$ stands for the big-O notation, i.e., $f(x) = O(g(x))$ implies that there exists some $x_0$ and a constant $c$ such that $|f(x)| \le c\,|g(x)|$ for all $x \ge x_0$.
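The update in (10) can be illustrated with a minimal two-player sketch. It is not taken from the paper: the function names, the payoff-matrix layout, and the tie-breaking rule (lowest index wins) are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def best_response(payoff: Matrix, opponent_freq: Sequence[float]) -> int:
    """Return an action maximizing expected payoff against opponent_freq.

    payoff[a][b] is the player's payoff for own action a and opponent action b.
    Ties go to the lowest index (an arbitrary, assumed tie-breaking rule).
    """
    expected = [sum(u * w for u, w in zip(row, opponent_freq)) for row in payoff]
    return max(range(len(expected)), key=expected.__getitem__)


def fictitious_play(u_row: Matrix, u_col: Matrix, start: Tuple[int, int],
                    horizon: int) -> List[Tuple[List[float], List[float]]]:
    """Return [mu_1, ..., mu_horizon] for two-player discrete-time fictitious play.

    u_row[a][b]: row player's payoff (own action a, column action b).
    u_col[b][a]: column player's payoff (own action b, row action a).
    """
    # mu_1 is the indicator of the profile played at time t = 0.
    mu_row = [1.0 if a == start[0] else 0.0 for a in range(len(u_row))]
    mu_col = [1.0 if b == start[1] else 0.0 for b in range(len(u_col))]
    history = [(mu_row, mu_col)]
    for t in range(1, horizon):
        a = best_response(u_row, mu_col)  # row's best response to the column frequencies
        b = best_response(u_col, mu_row)  # column's best response to the row frequencies
        # Update (10): mu_{t+1} = t/(t+1) * mu_t + 1/(t+1) * I_t.
        mu_row = [(t * w + (1.0 if i == a else 0.0)) / (t + 1) for i, w in enumerate(mu_row)]
        mu_col = [(t * w + (1.0 if j == b else 0.0)) / (t + 1) for j, w in enumerate(mu_col)]
        history.append((mu_row, mu_col))
    return history
```

Because each entry of $I_t - \mu_t$ lies in $[-1, 1]$, consecutive entries of the returned history differ by at most $1/(t+1)$ in every coordinate, which is the content of (11).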

We start analyzing discrete-time fictitious play in near-potential games by first focusing on the change in the value of the potential function along the fictitious play updates in the original game. In particular, we show that in near-potential games, if the empirical frequencies are outside some $\epsilon$-equilibrium set, then the potential of the close potential game (evaluated at the empirical frequency distribution) increases by discrete-time fictitious play updates.^10

^10 Our approach here is similar to the one used in Monderer and Shapley (1996a) to analyze discrete-time fictitious play in potential games.

Lemma 6.3. Consider a game $\mathcal{G}$ and let $\hat{\mathcal{G}}$ be a close potential game such that $d(\mathcal{G}, \hat{\mathcal{G}}) \le \delta$. Denote the potential function of $\hat{\mathcal{G}}$ by $\phi$. Assume that in $\mathcal{G}$ players update their strategies according to discrete-time fictitious play dynamics, and at some time instant $T > 0$, the empirical frequency distribution $\mu_T$ is outside an $\epsilon$-equilibrium set of $\mathcal{G}$. Then,

$$\phi(\mu_{T+1}) - \phi(\mu_T) \ge \frac{\epsilon - M\delta}{T+1} + O\!\left(\frac{1}{T^2}\right).$$

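Proof. Consider the mixed extension of the potential function $\phi(\mathbf{x}) = \sum_{\mathbf{p} \in E} \phi(\mathbf{p}) \prod_{m \in \mathcal{M}} x^m(p^m)$, where $\mathbf{x} = \{x^m\}_m$ and $x^m(p^m)$ denotes the probability that player $m$ plays strategy $p^m$. The expression for $\phi(\mathbf{x})$ implies that the Taylor expansion of $\phi$ around $\mu_T$ satisfies

$$\phi(\mu_{T+1}) = \phi(\mu_T) + \sum_{m \in \mathcal{M}} \sum_{p^m \in E^m} \left(\mu_{T+1}^m(p^m) - \mu_T^m(p^m)\right) \phi(p^m, \mu_T^{-m}) + O\!\left(\|\mu_{T+1} - \mu_T\|^2\right).$$

Observing from (10) that $\mu_{t+1} - \mu_t = \frac{1}{t+1}(I_t - \mu_t)$, and noting from (11) that $\|\mu_{t+1} - \mu_t\| = O\!\left(\frac{1}{t}\right)$, the above equality can be rewritten as

$$\phi(\mu_{T+1}) = \phi(\mu_T) + \frac{1}{T+1} \sum_{m \in \mathcal{M}} \sum_{p^m \in E^m} \left(I_T^m(p^m) - \mu_T^m(p^m)\right) \phi(p^m, \mu_T^{-m}) + O\!\left(\frac{1}{T^2}\right).$$

Rearranging the terms, and noting that $\sum_{p^m \in E^m} \mu_T^m(p^m)\, \phi(p^m, \mu_T^{-m}) = \phi(\mu_T^m, \mu_T^{-m})$, it follows that

$$\phi(\mu_{T+1}) = \phi(\mu_T) + \frac{1}{T+1} \sum_{m \in \mathcal{M}} \left(\phi(p_T^m, \mu_T^{-m}) - \phi(\mu_T^m, \mu_T^{-m})\right) + O\!\left(\frac{1}{T^2}\right).$$

Since $d(\mathcal{G}, \hat{\mathcal{G}}) \le \delta$, the above equality and the definition of MPD imply

$$\phi(\mu_{T+1}) \ge \phi(\mu_T) + \frac{1}{T+1} \sum_{m \in \mathcal{M}} \left(u^m(p_T^m, \mu_T^{-m}) - u^m(\mu_T^m, \mu_T^{-m}) - \delta\right) + O\!\left(\frac{1}{T^2}\right). \qquad (12)$$

By definition of the fictitious play dynamics, every player $m$ plays its best response to $\mu_T^{-m}$, therefore $u^m(p_T^m, \mu_T^{-m}) - u^m(\mu_T^m, \mu_T^{-m}) \ge 0$ for all $m$. Additionally, if $\mu_T$ is outside the $\epsilon$-equilibrium set, as in the statement of the lemma, then it follows that $u^m(p_T^m, \mu_T^{-m}) - u^m(\mu_T^m, \mu_T^{-m}) \ge \epsilon$ for at least one player. Therefore, (12) implies

$$\phi(\mu_{T+1}) \ge \phi(\mu_T) + \frac{\epsilon - M\delta}{T+1} + O\!\left(\frac{1}{T^2}\right),$$

hence, the claim follows. □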
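The first-order expansion used in the proof can be checked numerically in the two-player case, where the mixed extension of the potential is bilinear and the remainder is exactly the cross term scaled by $1/(T+1)^2$. The sketch below is illustrative only: the helper names and the matrix representation of the potential are assumptions, not part of the paper.

```python
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def mixed_potential(phi: Matrix, x_row: Sequence[float], x_col: Sequence[float]) -> float:
    """Mixed extension: sum over pure profiles of phi(p) times the product of probabilities."""
    return sum(phi[i][j] * x_row[i] * x_col[j]
               for i in range(len(x_row)) for j in range(len(x_col)))


def indicator(n: int, k: int) -> List[float]:
    """Unit vector of length n with a 1 at position k (the vector I_T^m)."""
    return [1.0 if i == k else 0.0 for i in range(n)]


def increment_terms(phi: Matrix, mu_row: Sequence[float], mu_col: Sequence[float],
                    a_row: int, a_col: int, T: int) -> Tuple[float, float, float]:
    """Return (exact increment, first-order term, remainder) for one update at time T."""
    i_row = indicator(len(mu_row), a_row)
    i_col = indicator(len(mu_col), a_col)
    # Update (10) applied to each player's empirical frequency vector.
    new_row = [(T * w + e) / (T + 1) for w, e in zip(mu_row, i_row)]
    new_col = [(T * w + e) / (T + 1) for w, e in zip(mu_col, i_col)]
    base = mixed_potential(phi, mu_row, mu_col)
    exact = mixed_potential(phi, new_row, new_col) - base
    # (1/(T+1)) * sum over players of phi(p_T^m, mu^{-m}) - phi(mu^m, mu^{-m}).
    first_order = ((mixed_potential(phi, i_row, mu_col) - base)
                   + (mixed_potential(phi, mu_row, i_col) - base)) / (T + 1)
    return exact, first_order, exact - first_order
```

With more than two players the remainder also contains higher-order products, but it stays $O(1/T^2)$ because every such term carries at least two factors of $1/(T+1)$.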

The above lemma implies that if $\mu_T$ is not in the $\epsilon$-equilibrium set for some $\epsilon > M\delta$, and $T$ is sufficiently large, then the potential evaluated at the empirical frequencies increases when players update their strategies. Since the mixed extension of the potential is a bounded function, the potential cannot increase unboundedly, and this observation suggests that the $\epsilon$-equilibrium set is eventually reached by the empirical frequency distribution. On the other hand, at a later time instant $\mu_t$ can still leave this equilibrium set, and before it does so the potential cannot be lower than the lowest potential in this set (since $\mu_t$ itself belongs to this set). Moreover, after $\mu_t$ leaves the $\epsilon$-equilibrium set the potential keeps increasing. Thus, the empirical frequencies are contained in the set of mixed strategy profiles which have potential at least as large as the minimum potential in this approximate equilibrium set. We next make this intuition precise, and characterize the set of limiting mixed strategies for fictitious play in near-potential games. We adopt the following convergence notion: we say that empirical frequencies of fictitious play converge to a set $S \subset \prod_{m \in \mathcal{M}} \Delta E^m$ if $\inf_{\mathbf{x} \in S} \|\mu_t - \mathbf{x}\| \to 0$ as $t \to \infty$.

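Theorem 6.1. Consider a game $\mathcal{G}$ and let $\hat{\mathcal{G}}$ be a close potential game such that $d(\mathcal{G}, \hat{\mathcal{G}}) \le \delta$. Denote the potential function of $\hat{\mathcal{G}}$ by $\phi$. Assume that in $\mathcal{G}$ players update their strategies according to discrete-time fictitious play dynamics, and let $\mathcal{X}_\alpha$ denote the $\alpha$-equilibrium set of $\mathcal{G}$. For any $\epsilon > 0$, there exists a time instant $T_\epsilon > 0$ such that for all $t > T_\epsilon$

$$\mu_t \in C_\epsilon \triangleq \left\{ \mathbf{x} \in \prod_{m \in \mathcal{M}} \Delta E^m \;\middle|\; \phi(\mathbf{x}) \ge \min_{\mathbf{y} \in \mathcal{X}_{M\delta + \epsilon}} \phi(\mathbf{y}) \right\}.$$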

Proof. Let $\epsilon'$ be such that $\epsilon > \epsilon' > 0$. It can be seen from the definition of $C_\epsilon$ that $\mathcal{X}_{M\delta+\epsilon'} \subset \mathcal{X}_{M\delta+\epsilon} \subset C_\epsilon$. We prove the claim in two steps: (i) We first show that in this update process $\mathcal{X}_{M\delta+\epsilon'}$ is visited infinitely often by $\mu_t$, i.e., for all $T'$, there exists $t > T'$ such that $\mu_t \in \mathcal{X}_{M\delta+\epsilon'}$. (ii) We prove that there exists a $T''$ such that if $\mu_t \in C_\epsilon$ for some $t > T''$, then for all $t' > t$ we have $\mu_{t'} \in C_\epsilon$. Thus, the second step guarantees that if $C_\epsilon$ is visited at a sufficiently late time instant, then $\mu_t$ remains in $C_\epsilon$. Since $\mathcal{X}_{M\delta+\epsilon'} \subset C_\epsilon$, the first step ensures that such a time instant exists, and the claim in the theorem immediately follows from (ii). Moreover, this time instant corresponds to $T_\epsilon$ in the theorem statement.

The proofs of both steps rely on the following simple observation: Lemma 6.3 implies that there exists a large enough $T$, such that if the empirical frequencies do not belong to $\mathcal{X}_{M\delta+\epsilon'}$ at a time instant $t > T$, then $\phi$ increases:

$$\phi(\mu_{t+1}) - \phi(\mu_t) \ge \frac{\epsilon'}{2(t+1)}. \qquad (13)$$

We prove (i) by contradiction. Assume that there exists a $T'$ such that $\mu_t \notin \mathcal{X}_{M\delta+\epsilon'}$ for $t > T'$, and let $T_m = \max\{T, T'\}$. Then, (13) holds for all $t \in \{T_m + 1, \dots\}$, and summing both sides of this inequality over this set we obtain

$$\limsup_{t \to \infty} \phi(\mu_{t+1}) - \phi(\mu_{T_m+1}) \ge \sum_{t = T_m + 1}^{\infty} \frac{\epsilon'}{2(t+1)}.$$

Since the mixed extension of the potential is a bounded function, it follows that the left hand side of the above inequality is bounded, but the right hand side grows unboundedly. Hence, we reach a contradiction, and (i) follows.

Lemma 6.1 (ii) implies that there exists some $\theta > 0$ such that if a strategy profile $\mathbf{x}$ is an $(M\delta + \epsilon')$-equilibrium, then any strategy profile $\mathbf{y}$ that satisfies $\|\mathbf{x} - \mathbf{y}\| < \theta$ is an $(M\delta + \epsilon)$-equilibrium (recall that $\epsilon > \epsilon' > 0$). Since $\|\mu_{t+1} - \mu_t\| = O(1/t)$ by (11), this implies that there exists some $T'' > T$, such that for all $t > T''$ if $\mu_t \in \mathcal{X}_{M\delta+\epsilon'}$, then we have

$$\mu_{t+1} \in \mathcal{X}_{M\delta+\epsilon}. \qquad (14)$$

Let $\mu_t \in C_\epsilon$ for some time instant $t > T''$. If $\mu_t \in \mathcal{X}_{M\delta+\epsilon'}$, then by (14) $\mu_{t+1} \in \mathcal{X}_{M\delta+\epsilon} \subset C_\epsilon$. If, on the other hand, $\mu_t \in C_\epsilon - \mathcal{X}_{M\delta+\epsilon'}$, then by (13) and the definition of $C_\epsilon$ we have

$$\phi(\mu_{t+1}) > \phi(\mu_t) \ge \min_{\mathbf{y} \in \mathcal{X}_{M\delta+\epsilon}} \phi(\mathbf{y}), \qquad (15)$$

and hence $\mu_{t+1} \in C_\epsilon$. Thus, we have established that there exists some $T''$ such that if $\mu_t \in C_\epsilon$ for some $t > T''$, then $\mu_{t+1} \in C_\epsilon$, and hence (ii) follows. □
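The contradiction in step (i) rests on the divergence of a harmonic-type sum. As a short worked bound (the constants $c > 0$ and $T_0 \ge 0$ below are generic placeholders, not quantities from the paper), comparing each term with an integral gives:

```latex
\sum_{t = T_0 + 1}^{N} \frac{c}{t+1}
  \;\ge\; c \int_{T_0 + 1}^{N + 1} \frac{ds}{s + 1}
  \;=\; c \,\ln\!\frac{N + 2}{T_0 + 2}
  \;\xrightarrow[N \to \infty]{}\; \infty .
```

So any per-step increase of order $1/t$ sustained forever would push a bounded potential past its upper bound, which is what rules out staying outside the approximate equilibrium set indefinitely.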

The above theorem establishes that after finite time $\mu_t$ is contained in the set $C_\epsilon$ for any $\epsilon > 0$. Corollary 6.1 establishes that in the limit this result can be strengthened: as $t \to \infty$, $\mu_t$ converges to a set which is a subset of $C_\epsilon$ for every $\epsilon > 0$. The proof can be found in the Appendix.

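Corollary 6.1. The empirical frequencies of discrete-time fictitious play converge to

$$C = \left\{ \mathbf{x} \in \prod_{m \in \mathcal{M}} \Delta E^m \;\middle|\; \phi(\mathbf{x}) \ge \min_{\mathbf{y} \in \mathcal{X}_{M\delta}} \phi(\mathbf{y}) \right\}.$$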

This result suggests that in near-potential games, the empirical frequencies of fictitious play converge to a set where the potential is at least as large as the minimum potential in an approximate equilibrium set. For exact potential games, it is known that the empirical frequencies converge to a Nash equilibrium (Monderer and Shapley 1996a). It can be seen from Definition 2.1 that in potential games, maximizers of the potential function are equilibria of the game. Thus, in potential games with a unique equilibrium the equilibrium is the unique maximizer of the potential function. Hence, for such games, we have $\delta = 0$, $\min_{\mathbf{y} \in \mathcal{X}_{M\delta}} \phi(\mathbf{y}) = \max_{\mathbf{x} \in \prod_{m \in \mathcal{M}} \Delta E^m} \phi(\mathbf{x})$, and Corollary 6.1 implies that empirical frequencies of fictitious play converge to the unique equilibrium of the game, recovering the convergence result of Monderer and Shapley (1996a). However, when there are multiple equilibria, Corollary 6.1 suggests that empirical frequencies converge to the set of mixed strategy profiles that have potential weakly larger than the minimum potential attained by the equilibria. While this set contains equilibria, it may contain a continuum of other mixed strategy profiles. This suggests that in games with multiple equilibria our result may provide a loose characterization of the limiting behavior of fictitious play dynamics.
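To see how loose the set $C$ can be, consider a small sketch on a 2x2 coordination game with three equilibria. Everything specific here is an assumption for illustration: the payoffs (3, 2) on one diagonal cell, (2, 3) on the other, and (0, 0) elsewhere, the potential matrix derived from them, and all function names. Since the game is an exact potential game, $\delta = 0$ and the threshold is the minimum potential over the exact equilibria.

```python
from typing import List, Tuple

# Assumed payoffs, indexed [row action][column action]; action 0 = O, action 1 = F.
U_ROW = [[3.0, 0.0], [0.0, 2.0]]
U_COL = [[2.0, 0.0], [0.0, 3.0]]
# A potential for the assumed payoffs (checked by is_exact_potential below).
PHI = [[3.0, 1.0], [0.0, 3.0]]
# (p, q) = (probability row plays O, probability column plays O).
EQUILIBRIA: List[Tuple[float, float]] = [(1.0, 1.0), (0.0, 0.0), (0.6, 0.4)]


def is_exact_potential() -> bool:
    """Check that unilateral payoff changes equal potential changes."""
    for b in range(2):  # row player deviates, column action fixed
        if abs((U_ROW[0][b] - U_ROW[1][b]) - (PHI[0][b] - PHI[1][b])) > 1e-12:
            return False
    for a in range(2):  # column player deviates, row action fixed
        if abs((U_COL[a][0] - U_COL[a][1]) - (PHI[a][0] - PHI[a][1])) > 1e-12:
            return False
    return True


def potential(p: float, q: float) -> float:
    """Mixed extension of PHI at the profile (p, q)."""
    x, y = [p, 1.0 - p], [q, 1.0 - q]
    return sum(PHI[a][b] * x[a] * y[b] for a in range(2) for b in range(2))


def max_deviation_gain(p: float, q: float) -> float:
    """Largest gain any player can obtain by a unilateral pure deviation."""
    x, y = [p, 1.0 - p], [q, 1.0 - q]
    row_pure = [sum(U_ROW[a][b] * y[b] for b in range(2)) for a in range(2)]
    col_pure = [sum(U_COL[a][b] * x[a] for a in range(2)) for b in range(2)]
    row_now = sum(x[a] * row_pure[a] for a in range(2))
    col_now = sum(y[b] * col_pure[b] for b in range(2))
    return max(max(row_pure) - row_now, max(col_pure) - col_now)


# With delta = 0 the threshold is the lowest potential among the equilibria.
THRESHOLD = min(potential(p, q) for p, q in EQUILIBRIA)


def in_limit_set(p: float, q: float, tol: float = 1e-12) -> bool:
    """Membership in C = {x : phi(x) >= min over equilibria of phi}."""
    return potential(p, q) >= THRESHOLD - tol
```

Under these assumptions the profile (0.9, 0.9) lies in $C$ even though a player can gain 0.25 by deviating there, so $C$ contains profiles that are far from being equilibria.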

We next show that by exploiting the properties of mixed approximate equilibrium sets, it is possible to obtain a stronger result. Before we present our result, we discuss a feature of mixed equilibrium sets which will be key in our analysis: for small $\epsilon$, the $\epsilon$-equilibrium set is contained in a small neighborhood of equilibria (this statement follows from Lemma 6.2 (ii) by considering the upper semicontinuity of the approximate equilibrium correspondence $g(\alpha)$ at $\alpha = 0$). This property is illustrated in Example 6.1.

Table 3: Payoffs in BoS. [Table body lost in extraction; the only surviving entry is "2, 3".]



Example 6.1 (Mixed equilibrium set of Battle of the Sexes). Consider the two-player battle of the sexes (BoS) game: each player has two possible actions $\{O, F\}$, and the payoffs of the players are as given in Table 3. This game has three equilibria: (i) both players use $O$, (ii) both players use $F$, (iii) the row player uses $O$ with probability 0.6, and the column player uses $O$ with probability 0.4. Note that since this is a game where each player has only two strategies, the probability of using strategy $O$ in the third case uniquely identifies the corresponding mixed equilibrium. For different values of $\epsilon$, the set of $\epsilon$-equilibria of this game is shown in Figure 4. It follows that the set of $\epsilon$-equilibria is contained in disjoint neighborhoods of equilibria for small values of $\epsilon$.

[Figure 4 panels: (a) 0.2-equilibrium set; (b) 0.3-equilibrium set; (c) 0.4-equilibrium set.]

Figure 4: Approximate equilibrium sets in BoS are contained in disjoint neighborhoods of equilibria for small $\epsilon$.
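Membership in the $\epsilon$-equilibrium set of a 2x2 game can be tested directly from the definition. The sketch below is an illustration under stated assumptions: the payoff table is not recoverable from the text, so the payoffs used here ((3, 2) when both play $O$, (2, 3) when both play $F$, and (0, 0) otherwise) are hypothetical values chosen to be consistent with the mixed equilibrium (0.6, 0.4) stated in the example; the function names are likewise invented.

```python
# Assumed payoffs, indexed [row action][column action]; action 0 = O, action 1 = F.
U_ROW = [[3.0, 0.0], [0.0, 2.0]]
U_COL = [[2.0, 0.0], [0.0, 3.0]]


def max_deviation_gain(p: float, q: float) -> float:
    """Largest unilateral gain at the profile where row plays O w.p. p and column w.p. q."""
    x, y = [p, 1.0 - p], [q, 1.0 - q]
    # Expected payoff of each pure action against the opponent's mixed strategy.
    row_pure = [sum(U_ROW[a][b] * y[b] for b in range(2)) for a in range(2)]
    col_pure = [sum(U_COL[a][b] * x[a] for a in range(2)) for b in range(2)]
    # Expected payoff of the mixed strategy currently in use.
    row_now = sum(x[a] * row_pure[a] for a in range(2))
    col_now = sum(y[b] * col_pure[b] for b in range(2))
    return max(max(row_pure) - row_now, max(col_pure) - col_now)


def is_eps_equilibrium(p: float, q: float, eps: float, tol: float = 1e-12) -> bool:
    """True if no player can gain more than eps by deviating (up to rounding tolerance)."""
    return max_deviation_gain(p, q) <= eps + tol
```

For instance, under the assumed payoffs the uniform profile (0.5, 0.5) has a maximal deviation gain of 0.25, so it belongs to the 0.3-equilibrium set but not to the 0.2-equilibrium set.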
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It was established in Lemma 6.3 that the potential function of a nearby potential game 
(with MPD $\delta$ to the original game), evaluated at the empirical frequency distribution, increases when this distribution is outside the $M\delta$-equilibrium set of the original game (where $M$ is the number of players). If $\delta$ is sufficiently small, then the $M\delta$-equilibria of the game will be contained in a small neighborhood of the equilibria, as illustrated above and shown in Lemma 6.2 (ii). Thus, for sufficiently small $\delta$, it is possible to establish that the potential of a close potential game increases outside a small neighborhood of the equilibria of the game.

In Theorem 6.2, we use this observation to show that for sufficiently small $\delta$ the empirical frequencies of fictitious play dynamics converge to a neighborhood of an equilibrium. We state the theorem under the assumption that the original game has finitely many equilibria. This assumption generically holds, i.e., for any game a (nondegenerate) random perturbation of payoffs will lead to such a game with probability one (see Fudenberg and Tirole (1991)). When stating our result, we make use of the Lipschitz continuity of the mixed extension of the potential function, as established in Lemma 6.1.



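Theorem 6.2. Consider a game $\mathcal{G}$ and let $\hat{\mathcal{G}}$ be a close potential game such that $d(\mathcal{G}, \hat{\mathcal{G}}) \le \delta$. Denote the potential function of $\hat{\mathcal{G}}$ by $\phi$, and the Lipschitz constant of the mixed extension of $\phi$ by $L$. Assume that $\mathcal{G}$ has finitely many equilibria, and in $\mathcal{G}$ players update their strategies according to discrete-time fictitious play dynamics.

There exists some $\bar\delta > 0$ and $\bar\epsilon > 0$ (which are functions of utilities of $\mathcal{G}$ but not $\delta$) such that if $\delta < \bar\delta$, then the empirical frequencies of fictitious play converge to

$$\left\{ \mathbf{x} \;\middle|\; \|\mathbf{x} - \mathbf{x}_k\| \le \frac{4 f(M\delta) M L}{\epsilon} + f(M\delta + \epsilon), \text{ for some equilibrium } \mathbf{x}_k \right\}, \qquad (16)$$

for any $\epsilon$ such that $\bar\epsilon \ge \epsilon > 0$, where $f : \mathbb{R}_+ \to \mathbb{R}_+$ is an upper semicontinuous function satisfying $f(x) \to 0$ as $x \to 0$.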

The proof of this theorem can be found in the Appendix, and it has three main steps illustrated in Figures 5 and 6. As explained earlier, for small $\delta$ and $\epsilon$, the $(M\delta + \epsilon)$-equilibrium set of the game is contained in disjoint neighborhoods of the equilibria of the game. Lemma 6.3 implies that the potential evaluated at $\mu_t$ increases outside this approximate equilibrium set with strategy updates. In the proof, we first quantify the increase in the potential when $\mu_t$ leaves this approximate equilibrium set and returns to it at a later time instant (see Figure 5a). Then, using this increase condition we show that for sufficiently large $t$, $\mu_t$ can visit the approximate equilibrium set infinitely often only around one equilibrium, say $\mathbf{x}_{k'}$ (see Figure 5b). This holds since the increase condition guarantees that the potential increases significantly when $\mu_t$ leaves the neighborhood of an equilibrium $\mathbf{x}_k$ and reaches that of $\mathbf{x}_{k'}$. Finally, using the increase condition one more time, we establish that if after time $T$, $\mu_t$ visits the approximate equilibrium set only in the neighborhood of $\mathbf{x}_{k'}$, we can construct a neighborhood of $\mathbf{x}_{k'}$ which contains $\mu_t$ for all $t > T$ (see Figure 6). This neighborhood is expressed in (16).



Observe that if $\delta = 0$, i.e., the original game is a potential game, then $f(M\delta) = 0$, and Theorem 6.2 implies that empirical frequencies of fictitious play converge to the $f(\epsilon)$-neighborhood of equilibria for any $\epsilon$ such that $\bar\epsilon \ge \epsilon > 0$. Thus, choosing $\epsilon$ arbitrarily small, and observing that $\lim_{x \to 0} f(x) = 0$, our result implies that in potential games, empirical frequencies converge to the set of Nash equilibria. Hence, as a special case of Theorem 6.2 we obtain the convergence result of Monderer and Shapley (1996a).

[Figure 5 panels: (a) If empirical frequencies leave an approximate equilibrium set at time $t$, and return to it at $t'$, then $\phi(\mu_{t'}) > \phi(\mu_t)$. (b) For sufficiently large $t$, $\mu_t$ visits the component of the approximate equilibrium set contained in the neighborhood of a single equilibrium.]

Figure 5: For small $\delta$ and $\epsilon$, the $(M\delta + \epsilon)$-equilibrium set (enclosed by solid lines around equilibria $\mathbf{x}_{k'}$ and $\mathbf{x}_k$) is contained in disjoint neighborhoods of equilibria. If the empirical frequency distribution $\mu_t$ is outside this approximate equilibrium set, then the potential increases with each strategy update. Assume that the empirical frequency distribution leaves an approximate equilibrium set (at time $t$) and returns to it at a later time instant ($t' > t$). We first quantify the resulting increase in the potential (left). If $\mu_t$ travels from the component of the approximate equilibrium set in the neighborhood of equilibrium $\mathbf{x}_k$ to that in the neighborhood of equilibrium $\mathbf{x}_{k'}$, then the increase in the potential is significant, and consequently $\mu_t$ cannot visit the approximate equilibrium set in the neighborhood of equilibrium $\mathbf{x}_k$ at a later time instant (right).

Figure 6: If after time $T$, $\mu_t$ only visits the approximate equilibrium set in the neighborhood of a single equilibrium $\mathbf{x}_{k'}$, then we can establish that $\mu_t$ never leaves a neighborhood of this equilibrium for $t > T$. The size of this neighborhood is denoted by $r$ in the figure and can be expressed as in Theorem 6.2.

Assume that $\delta \neq 0$ and a small $\epsilon \le \bar\epsilon$ is given. If $\delta$ is sufficiently small then $f(M\delta)/\epsilon \approx 0$, since $\lim_{x \to 0} f(x) = 0$. Consequently, $\frac{4 f(M\delta) M L}{\epsilon} + f(M\delta + \epsilon)$ is small. Thus, we conclude that for games that are close to potential games, i.e., for $\delta \ll 1$, Theorem 6.2 establishes convergence of empirical frequencies to a small neighborhood of equilibria.

7. Conclusions 

In this paper, we present a framework for studying the limiting behavior of adaptive learning dynamics in finite strategic form games by exploiting their relation to nearby potential games. We restrict our attention to better/best response, logit response and fictitious play dynamics. We show that for near-potential games trajectories of better/best response dynamics converge to $\epsilon$-equilibrium sets, where $\epsilon$ depends on closeness to a potential game. We study the stochastically stable strategy profiles of logit response dynamics and prove that they are contained in the set of strategy profiles that approximately maximize the potential function of a nearby potential game. In the case of fictitious play we focus on the empirical frequencies of players' actions, and establish that they converge to a small neighborhood of equilibria in near-potential games. Our results suggest that games that are close to a potential game inherit the dynamical properties (such as convergence to approximate equilibrium sets) of potential games. Additionally, since a close potential game to a given game can be found by solving a convex optimization problem, as discussed in Section 3, this enables us to study the dynamical properties of strategic form games by first identifying a nearby potential game, and then studying the dynamical properties of that potential game.

The framework presented in this paper opens up a number of interesting research direc- 
tions. Among them, we mention the following: 

Heterogeneous update rules. In this paper we only analyzed update rules in which all players update their strategies using the same mechanism. For instance, we assumed that all players adopt best response, or logit response dynamics with the same parameter. The limiting behavior of dynamic processes where players adhere to different update rules is still an open question, even for potential games. An interesting future research question is whether the techniques in this paper can be used to understand the limiting behavior of such update rules. For example, consider a potential game where all players update their strategies using logit response with different but "close" $\tau$ parameters. Can the outcome of this dynamic process be approximated with the outcome of logit response in a close potential game where all players use the same parameter for their updates?

Guaranteeing desirable limiting behavior. Another promising research direction is to use our understanding of simple update rules, such as better/best response and logit response dynamics, to design mechanisms that guarantee desirable limiting behavior, such as low efficiency loss and "fair" outcomes. It is well known that equilibria in games can be very different in terms of such properties (Roughgarden 2005). Hence, it is of interest to develop update rules that converge to a particular equilibrium, thus providing equilibrium refinement in the limit, or to find mechanisms that modify the underlying game in a way that can induce desirable limiting behavior. It has been shown in some cases that simple pricing mechanisms can ensure convergence to desirable equilibria in near-potential games (Candogan et al. 2010a). It is an interesting research direction to extend such mechanisms to general games.



Dynamics in "near" zero-sum and supermodular games. Dynamical properties of simple update rules in zero-sum games and supermodular games are also well understood (Milgrom and Roberts 1990, Shamma and Arslan 2004). If a game is close to a zero-sum game or a supermodular game, does it still inherit some of the dynamical properties of the original game? If such "continuity" properties do not hold, then the results on dynamical properties of these classes of games may be fragile. Hence, it would be interesting to investigate whether results analogous to the ones in this paper can be established for these classes of games.

Appendix A. Proofs of Section 6



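Proof of Lemma 6.1. (i) The mixed extension of $\nu$ can be given as in (8):

$$\nu(\mathbf{x}) = \sum_{\mathbf{p} \in E} \nu(\mathbf{p}) \prod_{k \in \mathcal{M}} x^k(p^k).$$

Hence $\nu(\mathbf{x})$ is equal to the sum of the Lipschitz continuous functions $\{\nu(\mathbf{p}) \prod_{k \in \mathcal{M}} x^k(p^k)\}_{\mathbf{p} \in E}$. The claim follows since, as a function of $\mathbf{x} \in \prod_{m \in \mathcal{M}} \Delta E^m$, the Lipschitz constant of $\nu(\mathbf{p}) \prod_{k \in \mathcal{M}} x^k(p^k)$ is bounded by $\nu(\mathbf{p})\sqrt{M} \le \nu(\mathbf{p}) M$.

(ii) Let $\nu : \prod_{m \in \mathcal{M}} \Delta E^m \to \mathbb{R}$ be a function such that

$$\nu(\mathbf{x}) = -\max_{m \in \mathcal{M},\, p^m \in E^m} \left(u^m(p^m, \mathbf{x}^{-m}) - u^m(x^m, \mathbf{x}^{-m})\right). \qquad (A.1)$$

It follows from the definition of $\epsilon$-equilibrium that a strategy profile $\mathbf{x}$ is an $\epsilon$-equilibrium if and only if $\nu(\mathbf{x}) \ge -\epsilon$.

By (i), it follows that mixed extensions of utility functions are Lipschitz continuous. Thus, the difference $u^m(p^m, \mathbf{x}^{-m}) - u^m(x^m, \mathbf{x}^{-m})$ is Lipschitz continuous in $\mathbf{x}$. Since $\nu$ is obtained from the maximum of finitely many such functions, we conclude that it is Lipschitz continuous with some constant $L$. It follows that if $\|\mathbf{x} - \mathbf{y}\| < \theta$, then $|\nu(\mathbf{x}) - \nu(\mathbf{y})| < L\theta$. Thus, choosing $\theta < (\epsilon' - \epsilon)/L$, and recalling that $\mathbf{x}$ is an $\epsilon$-equilibrium if and only if $\nu(\mathbf{x}) \ge -\epsilon$, the claim follows. □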



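Proof of Lemma 6.2. (i) Consider the graph of $g$, i.e., $S = \{(v, \mathbf{x}) \mid v \in \mathbb{R},\ \mathbf{x} \in g(v)\}$. The definition of $g$ suggests that $S$ can alternatively be written as

$$S = \left\{ (v, \mathbf{x}) \;\middle|\; v \in \mathbb{R},\ \mathbf{x} \in \prod_{m \in \mathcal{M}} \Delta E^m,\ \nu(\mathbf{x}) \ge -v \right\}. \qquad (A.2)$$

Since $\nu$ is upper semicontinuous, the function $h : \mathbb{R} \times \prod_{m \in \mathcal{M}} \Delta E^m \to \mathbb{R}$ such that $h(v, \mathbf{x}) = \nu(\mathbf{x}) + v$ is also an upper semicontinuous function. Since upper level sets of upper semicontinuous functions are closed (see Berge (1963)), the set $\{(v, \mathbf{x}) \in \mathbb{R} \times \prod_{m \in \mathcal{M}} \Delta E^m \mid h(v, \mathbf{x}) \ge 0\}$, or equivalently $S$, is a closed subset of $\mathbb{R} \times \prod_{m \in \mathcal{M}} \Delta E^m$. Thus, the graph of $g$ is closed, and since $\prod_{m \in \mathcal{M}} \Delta E^m$ is a compact set the claim follows from Definition 6.2.

(ii) Let $\nu : \prod_{m \in \mathcal{M}} \Delta E^m \to \mathbb{R}$ be as in (A.1), i.e.,

$$\nu(\mathbf{x}) = -\max_{m \in \mathcal{M},\, p^m \in E^m} \left(u^m(p^m, \mathbf{x}^{-m}) - u^m(x^m, \mathbf{x}^{-m})\right).$$

As explained in the proof of Lemma 6.1, $\mathbf{x}$ is an $\epsilon$-equilibrium if and only if $\nu(\mathbf{x}) \ge -\epsilon$. The claim follows from part (i) by noting that $\nu$ is a continuous function, and $g(\alpha) = \mathcal{X}_\alpha = \{\mathbf{x} \mid \nu(\mathbf{x}) \ge -\alpha\}$. □

Proof of Corollary 6.1. Let $\epsilon_n = \frac{1}{n}$ for $n \in \mathbb{Z}_+$, so that $C_{\epsilon_n}$ is defined through the $(M\delta + \frac{1}{n})$-equilibrium set. Observe that since the mixed extension of the potential function is continuous, $C$ and $C_{\epsilon_n}$ are closed sets for any $n \in \mathbb{Z}_+$. Since $C$ is closed, $\min_{\mathbf{y} \in C} \|\mathbf{x} - \mathbf{y}\|$ is well-defined for any $\mathbf{x} \in \prod_{m \in \mathcal{M}} \Delta E^m$.

We claim that for any $\theta > 0$ the set

$$S_\theta = \left\{ \mathbf{x} \in \prod_{m \in \mathcal{M}} \Delta E^m \;\middle|\; \min_{\mathbf{y} \in C} \|\mathbf{x} - \mathbf{y}\| < \theta \right\} \qquad (A.3)$$

is such that $C_{\epsilon_n} \subset S_\theta$ for some $n$. Note that if this claim holds, then it follows from Theorem 6.1 that there exists some $T_\theta$ such that for all $t > T_\theta$ we have $\mu_t \in S_\theta$. Using the definition of $S_\theta$ given in (A.3), this implies

$$\limsup_{t \to \infty} \min_{\mathbf{x} \in C} \|\mathbf{x} - \mu_t\| \le \theta. \qquad (A.4)$$

Moreover, since $\theta > 0$ is arbitrary, and $\|\mathbf{x} - \mu_t\| \ge 0$, using (A.4) we obtain

$$\lim_{t \to \infty} \min_{\mathbf{x} \in C} \|\mathbf{x} - \mu_t\| = 0.$$

Thus, if we prove $C_{\epsilon_n} \subset S_\theta$ for some $n$, it follows that $\mu_t$ converges to $C$.

In order to prove $C_{\epsilon_n} \subset S_\theta$ we first obtain a certificate which can be used to guarantee that a mixed strategy profile belongs to $S_\theta$. Then, we show that for large enough $n$ any $\mathbf{z} \in C_{\epsilon_n}$ satisfies this certificate, and hence belongs to $S_\theta$.

It follows from Lemma 6.2 (i) (by setting $\nu = \phi$ and $v = -\min_{\mathbf{y} \in \mathcal{X}_{M\delta}} \phi(\mathbf{y})$) and the definition of upper semicontinuity (Definition 6.2) that there exists $\gamma > 0$ such that the $\theta$ neighborhood of $\{\mathbf{x} \mid \phi(\mathbf{x}) \ge \min_{\mathbf{y} \in \mathcal{X}_{M\delta}} \phi(\mathbf{y})\}$ contains $\{\mathbf{x} \mid \phi(\mathbf{x}) \ge \min_{\mathbf{y} \in \mathcal{X}_{M\delta}} \phi(\mathbf{y}) - \gamma\}$. Hence, for any $\mathbf{z}$ satisfying $\phi(\mathbf{z}) \ge \min_{\mathbf{y} \in \mathcal{X}_{M\delta}} \phi(\mathbf{y}) - \gamma$ there exists some $\mathbf{x}$ satisfying $\phi(\mathbf{x}) \ge \min_{\mathbf{y} \in \mathcal{X}_{M\delta}} \phi(\mathbf{y})$ and $\|\mathbf{x} - \mathbf{z}\| < \theta$. Note that the definition of $S_\theta$ implies that any $\mathbf{z}$ for which there exists such an $\mathbf{x}$ belongs to $S_\theta$. Thus, if $\phi(\mathbf{z}) \ge \min_{\mathbf{y} \in \mathcal{X}_{M\delta}} \phi(\mathbf{y}) - \gamma$ it follows that $\mathbf{z} \in S_\theta$.

We next show that for large enough $n$, any $\mathbf{z}$ which belongs to $C_{\epsilon_n}$ satisfies the above certificate and hence belongs to $S_\theta$. Let $L$ denote the Lipschitz constant for the mixed extension of $\phi$, as given in Lemma 6.1 (i), and define $\theta' = \gamma / L > 0$. Lemma 6.2 (ii) and Definition 6.2 imply that for large enough $n$, $\mathcal{X}_{M\delta + \frac{1}{n}}$ is contained in the $\theta'$ neighborhood of $\mathcal{X}_{M\delta}$, i.e., if $\mathbf{y} \in \mathcal{X}_{M\delta + \frac{1}{n}}$ then there exists $\mathbf{x} \in \mathcal{X}_{M\delta}$ such that $\|\mathbf{x} - \mathbf{y}\| < \theta'$. Moreover, by Lemma 6.1 (i), it follows that $\phi(\mathbf{y}) \ge \phi(\mathbf{x}) - L\theta' = \phi(\mathbf{x}) - \gamma$. Thus, we conclude that there exists large enough $n$ such that

$$\min_{\mathbf{y} \in \mathcal{X}_{M\delta + \frac{1}{n}}} \phi(\mathbf{y}) \ge \min_{\mathbf{y} \in \mathcal{X}_{M\delta}} \phi(\mathbf{y}) - \gamma. \qquad (A.5)$$

Let $\mathbf{z} \in C_{\epsilon_n}$ for some $n$ for which (A.5) holds. By definition of $C_{\epsilon_n}$ it follows that $\phi(\mathbf{z}) \ge \min_{\mathbf{y} \in \mathcal{X}_{M\delta + 1/n}} \phi(\mathbf{y})$. Thus, (A.5) implies that $\phi(\mathbf{z}) \ge \min_{\mathbf{y} \in \mathcal{X}_{M\delta}} \phi(\mathbf{y}) - \gamma$. However, as argued before, such $\mathbf{z}$ belong to $S_\theta$. Hence, we have established that for large enough $n$, if $\mathbf{z} \in C_{\epsilon_n}$ then $\mathbf{z} \in S_\theta$. Therefore, the claim follows. □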






Proof of Theorem 6.2. Assume that $\mathcal{G}$ has $l$ equilibria, denoted by $\mathbf{x}_1, \dots, \mathbf{x}_l$. Define the minimum pairwise distance between the equilibria as $d = \min_{i \neq j} \|\mathbf{x}_i - \mathbf{x}_j\|$. Let $f : \mathbb{R}_+ \to \mathbb{R}_+$ be a function such that

$$f(\alpha) = \max_{\mathbf{x} \in \mathcal{X}_\alpha}\ \min_{k \in \{1, \dots, l\}} \|\mathbf{x} - \mathbf{x}_k\|, \qquad (A.6)$$

for all $\alpha \in \mathbb{R}_+$. Note that $\min_{k \in \{1, \dots, l\}} \|\mathbf{x} - \mathbf{x}_k\|$ is continuous in $\mathbf{x}$, since it is the minimum of finitely many continuous functions. Moreover, $\mathcal{X}_\alpha$ is a compact set, since $\epsilon$-equilibria are defined by finitely many inequality constraints of the form (9). Therefore, in (A.6) the maximum is achieved and $f$ is well-defined for all $\alpha \ge 0$. From the definition of $f$, it follows that the union of closed balls of radius $f(\alpha)$, centered at equilibria, contains the $\alpha$-equilibrium set of the game. Thus, intuitively, $f(\alpha)$ captures the size of a closed neighborhood of equilibria which contains the $\alpha$-equilibria of the underlying game. This is illustrated in Figure A.7.

Figure A.7: Consider a game with a unique equilibrium $\mathbf{x}_k$. The $\alpha$-equilibrium set of the game (enclosed by a solid line around $\mathbf{x}_k$) is contained in the $f(\alpha)$ neighborhood of this equilibrium.



Let $a > 0$ be such that $f(a) < d/4$, i.e., every $a$-equilibrium is at most $d/4$ distant from an equilibrium of the game. Lemma 6.2 (ii) implies (using upper semicontinuity at $\alpha = 0$) that such an $a$ exists. Since $d$ is defined as the minimum pairwise distance between the equilibria, it follows that the $a$-equilibria of the game are contained in disjoint $f(a) < d/4$ neighborhoods around equilibria of the game, i.e., if $\mathbf{x} \in \mathcal{X}_a$, then $\|\mathbf{x} - \mathbf{x}_k\| \le f(a)$ for exactly one equilibrium $\mathbf{x}_k$. Moreover, for $\alpha_1 \le a$, since $\mathcal{X}_{\alpha_1} \subset \mathcal{X}_a$, it follows that the $\alpha_1$-equilibria of the game are contained in disjoint neighborhoods of equilibria.
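The function $f$ in (A.6) can be approximated on a grid for a small example. The sketch below makes several assumptions that are not in the text: it uses a hypothetical battle-of-the-sexes payoff table ((3, 2) when both play $O$, (2, 3) when both play $F$, (0, 0) otherwise), measures distance with the Euclidean norm on full mixed-strategy vectors, and only evaluates a finite grid, so it yields a lower estimate of the true maximum.

```python
import math
from typing import List, Tuple

# Assumed payoffs, indexed [row action][column action]; action 0 = O, action 1 = F.
U_ROW = [[3.0, 0.0], [0.0, 2.0]]
U_COL = [[2.0, 0.0], [0.0, 3.0]]
# Equilibria as (probability row plays O, probability column plays O).
EQUILIBRIA: List[Tuple[float, float]] = [(1.0, 1.0), (0.0, 0.0), (0.6, 0.4)]


def max_deviation_gain(p: float, q: float) -> float:
    """Largest gain any player can obtain by a unilateral pure deviation."""
    x, y = [p, 1.0 - p], [q, 1.0 - q]
    row_pure = [sum(U_ROW[a][b] * y[b] for b in range(2)) for a in range(2)]
    col_pure = [sum(U_COL[a][b] * x[a] for a in range(2)) for b in range(2)]
    row_now = sum(x[a] * row_pure[a] for a in range(2))
    col_now = sum(y[b] * col_pure[b] for b in range(2))
    return max(max(row_pure) - row_now, max(col_pure) - col_now)


def profile_distance(u: Tuple[float, float], v: Tuple[float, float]) -> float:
    """Euclidean distance between profiles written as (p, 1 - p, q, 1 - q)."""
    return math.sqrt(2.0 * ((u[0] - v[0]) ** 2 + (u[1] - v[1]) ** 2))


def min_pairwise_distance() -> float:
    """The quantity d: smallest distance between two distinct equilibria."""
    return min(profile_distance(u, v)
               for i, u in enumerate(EQUILIBRIA) for v in EQUILIBRIA[i + 1:])


def f_grid(alpha: float, n: int = 100) -> float:
    """Grid estimate of f(alpha): max over alpha-equilibria of the distance to the nearest equilibrium."""
    worst = 0.0
    for i in range(n + 1):
        for j in range(n + 1):
            p, q = i / n, j / n
            if max_deviation_gain(p, q) <= alpha + 1e-12:  # (p, q) is an alpha-equilibrium
                nearest = min(profile_distance((p, q), e) for e in EQUILIBRIA)
                worst = max(worst, nearest)
    return worst
```

Under these assumptions the estimate is weakly increasing in $\alpha$ and vanishes at $\alpha = 0$, and a value such as $\alpha = 0.05$ already gives an estimate below $d/4$, which is the role played by the constant $a$ in the proof.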

We prove the theorem in 5 steps summarized below. The first two steps explore the properties of the function $f$, and define $\bar\delta$ and $\bar\epsilon$ presented in the theorem statement. The last three steps are the main steps of the proof, where we establish convergence of fictitious play to a neighborhood of equilibria.

• Step 1: We first show that $f$ is (i) weakly increasing, (ii) upper semicontinuous, and it satisfies (iii) $f(0) = 0$, (iv) $f(x) \to 0$ as $x \to 0$.

• Step 2: We show that there exists some $\bar\delta > 0$ and $\bar\epsilon > 0$ such that the following inequalities hold:

$$M\bar\delta + \bar\epsilon \le a, \qquad (A.7)$$

and

$$f(M\bar\delta + \bar\epsilon) < \frac{(a - M\bar\delta)\,d}{24LM}. \qquad (A.8)$$

We will prove the statement of the theorem assuming that $0 \le \delta < \bar\delta$, and establish convergence to the set in (16), for any $\epsilon$ such that $0 < \epsilon \le \bar\epsilon$. As can be seen from the definition of $a$ and $f$ (see (A.6)), the first inequality guarantees that the $(M\bar\delta + \bar\epsilon)$-equilibrium set is contained in disjoint neighborhoods of equilibria, and the second one guarantees that these neighborhoods are small. In Step 4, we will exploit this observation, and use the inequalities in (A.7) and (A.8) to establish that the empirical frequency distribution $\mu_t$ can visit the component of $\mathcal{X}_{M\delta + \bar\epsilon}$ contained in the neighborhood of only a single equilibrium infinitely often.

• Step 3: Let $\epsilon_1, \epsilon_2$ be such that $\epsilon_2 \ge \epsilon_1 > 0$. Assume that (i) at some time instant $T$, $\mu_T$ is contained in $\mathcal{X}_{M\delta+\epsilon_1}$, (ii) at time instants $T_1$ and $T_2$ (such that $T_2 > T_1 > T$) $\mu_t$ leaves $\mathcal{X}_{M\delta+\epsilon_1}$ and $\mathcal{X}_{M\delta+\epsilon_2}$ respectively, and (iii) at time instants $T_2'$ and $T_1'$ (such that $T_1' > T_2' > T_2$) $\mu_t$ returns to $\mathcal{X}_{M\delta+\epsilon_2}$ and $\mathcal{X}_{M\delta+\epsilon_1}$ respectively. In Figure A.8 the path $\mu_t$ follows between $T_1$ and $T_1'$ is illustrated.

In this step, we provide a lower bound on $\phi(\mu_{T_1'}) - \phi(\mu_{T_1})$, i.e., the increase in the potential when $\mu_t$ follows such a path. This lower bound holds for any $\epsilon_1$ and $\epsilon_2$ provided that $\epsilon_2 \ge \epsilon_1 > 0$. We use this result by choosing different values for $\epsilon_1$ and $\epsilon_2$ in Steps 4 and 5.

Our lower bound in Step 3 is a function of $\epsilon_2$. In addition to this lower bound, in Steps 4 and 5, we use the $(M\delta + \epsilon_1)$-equilibrium set and Lipschitz continuity of the potential to provide an upper bound on $\phi(\mu_{T_1'}) - \phi(\mu_{T_1})$ as a function of $\epsilon_1$. Thus, properties of the $(M\delta + \epsilon_1)$- and $(M\delta + \epsilon_2)$-equilibrium sets are exploited for obtaining upper and lower bounds on $\phi(\mu_{T_1'}) - \phi(\mu_{T_1})$ respectively. We establish convergence of fictitious play updates to a neighborhood of an equilibrium by using these bounds together in Steps 4 and 5. We emphasize that allowing for two different approximate equilibrium sets leads to better bounds on $\phi(\mu_{T_1'}) - \phi(\mu_{T_1})$, and a more informative characterization of the limiting behavior of fictitious play, as opposed to using a single approximate equilibrium set, i.e., setting $\epsilon_1 = \epsilon_2$.

• Step 4: Our objective in this step is to establish that fictitious play can visit the component of an approximate equilibrium set contained in the neighborhood of only one equilibrium infinitely often.

Let $\epsilon_1 = \bar{\epsilon}$ and $\epsilon_2 = a - M\bar{\delta}$. By (A.7) we have $\epsilon_1 < \epsilon_2$, and using the definition of $a$ we establish that $\mathcal{X}_{M\delta+\epsilon_1}$ and $\mathcal{X}_{M\delta+\epsilon_2}$ are contained in disjoint neighborhoods of equilibria. Assume that $\mu_t$ leaves the components of $\mathcal{X}_{M\delta+\epsilon_1}$ and $\mathcal{X}_{M\delta+\epsilon_2}$ in the neighborhood of equilibrium $x_k$, and reaches a similar neighborhood around equilibrium $x_{k'}$. Using Step 3 we establish a lower bound on the increase in the potential when $\mu_t$ follows such a trajectory. We also provide an upper bound, using the Lipschitz continuity of the potential and inequalities (A.7) and (A.8). Comparing these bounds, we establish that the maximum potential in the neighborhood of equilibrium $x_k$ is lower than the minimum potential in the neighborhood of $x_{k'}$. Since $x_k$ and $x_{k'}$ are arbitrary, this observation implies that $\mu_t$ cannot visit the component of $\mathcal{X}_{M\delta+\epsilon_1}$ contained in the neighborhood of $x_k$ at a later time instant. Hence, it follows that $\mu_t$ visits only one such component infinitely often.

• Step 5: In this step we show that $\mu_t$ converges to the approximate equilibrium set given in the theorem statement.

Let $\epsilon_1, \epsilon_2$ be such that $0 < \epsilon_1 < \epsilon_2 \le \bar{\epsilon}$. We consider the equilibrium whose neighborhood is visited infinitely often (as obtained in Step 4), and a trajectory of $\mu_t$ which leaves the components of $\mathcal{X}_{M\delta+\epsilon_1}$ and $\mathcal{X}_{M\delta+\epsilon_2}$ contained in the neighborhood of this equilibrium and returns to these sets at a later time instant (as illustrated in Figure A.8). As in Step 4, Lipschitz continuity of $\phi$ is used to obtain an upper bound on the increase in the potential between the end points of this trajectory. Together with the lower bound obtained in Step 3, this provides a bound on how far $\mu_t$ can get from the component of $\mathcal{X}_{M\delta+\epsilon_2}$ contained in this neighborhood. Choosing $\epsilon_1$ arbitrarily small (for a fixed $\epsilon_2$) we obtain the tightest such bound. Using this result, we quantify how far $\mu_t$ can get from the equilibria of the game (after sufficient time), and the theorem follows.

Next we prove each of these steps. 

Step 1: By definition $\mathcal{X}_{\alpha_1} \subset \mathcal{X}_{\alpha}$ for any $\alpha_1 \le \alpha$. Since the feasible set of the maximization problem in (A.6) is given by $\mathcal{X}_{\alpha}$, this implies that $f(\alpha_1) \le f(\alpha)$, i.e., $f$ is a weakly increasing function of its argument. Note that the feasible set of the maximization problem in (A.6) can be given by the correspondence $g(\alpha) = \mathcal{X}_{\alpha}$, which is upper semicontinuous in $\alpha$ as shown in Lemma 6.2 (ii). Since, as a function of $x$, $\min_{k \in \{1,\dots,l\}} \|x - x_k\|$ is continuous, it follows from Berge's maximum theorem (see Berge (1963)) that for $\alpha \ge 0$, $f(\alpha)$ is an upper semicontinuous function.

The set $\mathcal{X}_0$ corresponds to the set of equilibria of the game, hence $\mathcal{X}_0 = \{x_1, \dots, x_l\}$. Thus, the definition of $f$ implies that $f(0) = 0$. Moreover, upper semicontinuity of $f$ implies that for any $\epsilon > 0$, there exists some neighborhood $V$ of $0$ such that $f(x) \le \epsilon$ for all $x \in V$. Since $f(x) \ge 0$ by definition, this implies that $\lim_{x \to 0} f(x)$ exists and equals $0$.
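As a rough numerical illustration of the properties established in Step 1, the sketch below approximates $f(\alpha) = \max_{x \in \mathcal{X}_\alpha} \min_k \|x - x_k\|$ on a grid. The 2x2 coordination game, the grid resolution, the function names, and the use of Euclidean distance in the $(p, q)$ coordinates are all assumptions made for this illustration; none of them come from the paper.

```python
import numpy as np

# Hypothetical game: both players get payoff 1 if their actions match, else 0.
# p (resp. q) is the probability that player 1 (resp. player 2) plays action 0.
# Its equilibria in (p, q) coordinates are (0, 0), (1, 1) and (1/2, 1/2).
EQUILIBRIA = np.array([[0.0, 0.0], [1.0, 1.0], [0.5, 0.5]])


def epsilon_equilibrium_grid(alpha, n=21):
    """Grid points (p, q) at which no player gains more than alpha by deviating."""
    grid = np.linspace(0.0, 1.0, n)
    P, Q = np.meshgrid(grid, grid, indexing="ij")
    payoff = P * Q + (1 - P) * (1 - Q)        # common expected payoff
    gain1 = np.maximum(Q, 1 - Q) - payoff     # player 1's best unilateral gain
    gain2 = np.maximum(P, 1 - P) - payoff     # player 2's best unilateral gain
    mask = (gain1 <= alpha + 1e-12) & (gain2 <= alpha + 1e-12)
    return np.column_stack([P[mask], Q[mask]])


def f(alpha, n=21):
    """Grid approximation of max over X_alpha of the distance to the nearest equilibrium."""
    points = epsilon_equilibrium_grid(alpha, n)
    dists = np.linalg.norm(points[:, None, :] - EQUILIBRIA[None, :, :], axis=2)
    return dists.min(axis=1).max()
```

Because the grid versions of $\mathcal{X}_\alpha$ are nested in $\alpha$, the approximation is weakly increasing by construction, and it vanishes at $\alpha = 0$ since only the three equilibria survive there.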



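Step 2: Let $\bar{\delta} > 0$ be small enough such that $M\bar{\delta} < a/2$. Since $\lim_{x \to 0} f(x) = 0$, it follows that for sufficiently small $\bar{\delta}$ and $\bar{\epsilon}$, we obtain $f(M\bar{\delta} + \bar{\epsilon}) < \frac{ad}{48LM} < \frac{(a - M\bar{\delta})d}{24LM}$ and $M\bar{\delta} + \bar{\epsilon} < a$.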
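The existence argument in Step 2 can be mimicked by a simple halving search. The sketch below is only an illustration under stated assumptions: `sqrt_f` is a hypothetical stand-in for $f$ (any weakly increasing function with $f(x) \to 0$ as $x \to 0$ would do), and the function names, starting values, and numerical inputs are invented for the example.

```python
def sqrt_f(x):
    """Hypothetical stand-in for f: increasing, with f(0) = 0."""
    return x ** 0.5


def satisfies_step2(f, a, d, L, M, delta_bar, eps_bar):
    """Check inequalities (A.7) and (A.8) for candidate constants."""
    a7 = M * delta_bar + eps_bar < a
    a8 = f(M * delta_bar + eps_bar) < (a - M * delta_bar) * d / (24 * L * M)
    return a7 and a8


def find_constants(f, a, d, L, M, max_halvings=200):
    """Halve (delta_bar, eps_bar) until (A.7) and (A.8) both hold."""
    delta_bar = a / (4 * M)   # keeps M * delta_bar < a / 2, as in Step 2
    eps_bar = a / 4
    for _ in range(max_halvings):
        if satisfies_step2(f, a, d, L, M, delta_bar, eps_bar):
            return delta_bar, eps_bar
        delta_bar /= 2
        eps_bar /= 2
    return None   # not found within the budget
```

For example, `find_constants(sqrt_f, a=0.1, d=1.0, L=1.0, M=2)` returns a pair of small positive constants for which both inequalities can be re-checked with `satisfies_step2`.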

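Step 3: Let $\epsilon_1, \epsilon_2$ be such that $0 < \epsilon_1 < \epsilon_2$. Assume $T > 0$ is large enough so that for $t > T$,

$$\phi(\mu_{t+1}) - \phi(\mu_t) \ge \frac{2\epsilon_1}{3(t+1)} \ \text{ if } \mu_t \notin \mathcal{X}_{M\delta+\epsilon_1}, \quad \text{and similarly} \quad \phi(\mu_{t+1}) - \phi(\mu_t) \ge \frac{2\epsilon_2}{3(t+1)} \ \text{ if } \mu_t \notin \mathcal{X}_{M\delta+\epsilon_2}. \tag{A.9}$$

Existence of $T$ satisfying these inequalities follows from Lemma 6.3, since for large $T$ and $t > T$, this lemma implies $\phi(\mu_{t+1}) - \phi(\mu_t) \ge \frac{\epsilon_1}{t+1} + O\!\left(\frac{1}{t^2}\right) \ge \frac{2\epsilon_1}{3(t+1)}$ if $\mu_t \notin \mathcal{X}_{M\delta+\epsilon_1}$, and similarly if $\mu_t \notin \mathcal{X}_{M\delta+\epsilon_2}$.

Since $\phi(\mu_t)$ increases outside the $(M\delta+\epsilon_1)$-equilibrium set for $t > T$, as (A.9) suggests, it follows that $\mu_t$ visits $\mathcal{X}_{M\delta+\epsilon_1}$ (and $\mathcal{X}_{M\delta+\epsilon_2}$, since $\mathcal{X}_{M\delta+\epsilon_1} \subset \mathcal{X}_{M\delta+\epsilon_2}$) infinitely often. Otherwise, because the per-step increases in (A.9) sum to a divergent harmonic series, $\phi(\mu_t)$ increases unboundedly, and we reach a contradiction since the mixed extension of the potential is a bounded function.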
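The contradiction above rests on the fact that the per-step lower bounds $\frac{2\epsilon}{3(t+1)}$ sum to a divergent series. A minimal sketch of that arithmetic follows; the function names and the numerical values are assumptions chosen for illustration, and the logarithmic bound is the standard integral comparison $\sum_{t=s}^{e} \frac{1}{t+1} \ge \ln\frac{e+2}{s+1}$.

```python
import math


def potential_gain_lower_bound(eps, t_start, t_end):
    """Sum of 2*eps / (3*(t+1)) over t = t_start, ..., t_end."""
    return sum(2.0 * eps / (3.0 * (t + 1)) for t in range(t_start, t_end + 1))


def log_lower_bound(eps, t_start, t_end):
    """Integral comparison: the sum above is at least (2*eps/3) * ln((t_end+2)/(t_start+1))."""
    return (2.0 * eps / 3.0) * math.log((t_end + 2) / (t_start + 1))


def horizon_exceeding(eps, t_start, bound):
    """A horizon t_end after which the accumulated lower bound is at least `bound`."""
    return math.ceil((t_start + 1) * math.exp(3.0 * bound / (2.0 * eps)) - 2)
```

For any fixed `bound` (for instance, the range of a bounded potential), `horizon_exceeding` returns a finite time, which is why $\mu_t$ cannot stay outside the approximate equilibrium set forever.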

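Assume that at some time after $T$, $\mu_t$ leaves $\mathcal{X}_{M\delta+\epsilon_1}$ and $\mathcal{X}_{M\delta+\epsilon_2}$ and returns to $\mathcal{X}_{M\delta+\epsilon_1}$ at a later time instant. In this step, we quantify how much the potential increases when $\mu_t$ follows such a path. We first define time instants $T_1$, $T_2$, $T_1'$, and $T_2'$ satisfying $T < T_1 \le T_2 < T_2' \le T_1'$, as follows:

• $T_1$ is a time instant when $\mu_t$ leaves $\mathcal{X}_{M\delta+\epsilon_1}$, i.e., $\mu_{T_1-1} \in \mathcal{X}_{M\delta+\epsilon_1}$ and $\mu_t \notin \mathcal{X}_{M\delta+\epsilon_1}$ for $T_1 \le t < T_1'$.

• $T_2$ is a time instant when $\mu_t$ leaves $\mathcal{X}_{M\delta+\epsilon_2}$, i.e., $\mu_{T_2-1} \in \mathcal{X}_{M\delta+\epsilon_2}$ and $\mu_t \notin \mathcal{X}_{M\delta+\epsilon_2}$ for $T_2 \le t < T_2'$.

• $T_2'$ is the first time instant after $T_2$ when $\mu_t$ returns to $\mathcal{X}_{M\delta+\epsilon_2}$, i.e., $\mu_{T_2'-1} \notin \mathcal{X}_{M\delta+\epsilon_2}$ and $\mu_{T_2'} \in \mathcal{X}_{M\delta+\epsilon_2}$.

• $T_1'$ is the first time instant after $T_1$ when $\mu_t$ returns to $\mathcal{X}_{M\delta+\epsilon_1}$, i.e., $\mu_{T_1'-1} \notin \mathcal{X}_{M\delta+\epsilon_1}$ and $\mu_{T_1'} \in \mathcal{X}_{M\delta+\epsilon_1}$.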



The definitions are illustrated in Figure A.8. We next provide a lower bound on the quantity $\phi(\mu_{T_1'}) - \phi(\mu_{T_1})$. Note that if there are multiple time instants between $T_1$ and $T_1'$ at which $\mu_t$ leaves $\mathcal{X}_{M\delta+\epsilon_2}$ (as in the figure), any of these time instants can be chosen as $T_2$ to obtain a lower bound.

Figure A.8: Trajectory of $\mu_t$ (initialized at the left end of the dashed line) is illustrated. $T_1$ and $T_2$ correspond to the time instants $\mu_t$ leaves $\mathcal{X}_{M\delta+\epsilon_1}$ and $\mathcal{X}_{M\delta+\epsilon_2}$ respectively. $T_1'$ and $T_2'$ correspond to the time instants $\mu_t$ enters $\mathcal{X}_{M\delta+\epsilon_1}$ and $\mathcal{X}_{M\delta+\epsilon_2}$ respectively.
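The four time instants can be read off mechanically from a record of when the trajectory lies inside each set. The sketch below assumes the trajectory has already been summarized by two boolean sequences (membership in $\mathcal{X}_{M\delta+\epsilon_1}$ and in $\mathcal{X}_{M\delta+\epsilon_2}$, indexed by time); the function name and this representation are assumptions for illustration, and the first admissible exit time is chosen as $T_2$.

```python
def excursion_times(in_x1, in_x2, start=1):
    """Return (T1, T2, T2_prime, T1_prime) for the first excursion at or after `start`.

    in_x1[t] / in_x2[t] are True when mu_t lies in the smaller / larger set.
    Returns None if no complete excursion leaving both sets is found.
    """
    n = len(in_x1)
    # T1: mu_{T1-1} is inside the smaller set and mu_{T1} is outside it.
    t1 = next((t for t in range(max(start, 1), n)
               if in_x1[t - 1] and not in_x1[t]), None)
    if t1 is None:
        return None
    # T1': first return to the smaller set after T1.
    t1_ret = next((t for t in range(t1 + 1, n) if in_x1[t]), None)
    if t1_ret is None:
        return None
    # T2: an exit from the larger set during the excursion (first one chosen).
    t2 = next((t for t in range(t1, t1_ret)
               if in_x2[t - 1] and not in_x2[t]), None)
    if t2 is None:
        return None
    # T2': first return to the larger set after T2.
    t2_ret = next((t for t in range(t2 + 1, t1_ret + 1) if in_x2[t]), None)
    if t2_ret is None:
        return None
    return t1, t2, t2_ret, t1_ret
```

With `in_x1 = [1, 1, 0, 0, 0, 0, 0, 1, 1, 1]` and `in_x2 = [1, 1, 1, 0, 0, 0, 1, 1, 1, 1]` (as booleans), the function returns `(2, 3, 6, 7)`, which respects the ordering $T_1 \le T_2 < T_2' \le T_1'$.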



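By definition, for $t$ such that $T_2 \le t < T_2'$, we have $\mu_t \notin \mathcal{X}_{M\delta+\epsilon_2}$, and for $t$ such that $T_1 \le t < T_2$ or $T_2' \le t < T_1'$, we have $\mu_t \notin \mathcal{X}_{M\delta+\epsilon_1}$. Thus, it follows from (A.9) that

$$\phi(\mu_{t+1}) - \phi(\mu_t) \ge \frac{2\epsilon_2}{3(t+1)} \quad \text{for } T_2 \le t < T_2', \tag{A.10}$$

and consequently,

$$\phi(\mu_{T_2'}) - \phi(\mu_{T_2}) = \sum_{t=T_2}^{T_2'-1} \left( \phi(\mu_{t+1}) - \phi(\mu_t) \right) \ge \sum_{t=T_2}^{T_2'-1} \frac{2\epsilon_2}{3(t+1)}. \tag{A.11}$$

Similarly, since $\mu_t \notin \mathcal{X}_{M\delta+\epsilon_1}$ for $t$ such that $T_1 \le t < T_2$ or $T_2' \le t < T_1'$, using (A.9) we establish

$$\phi(\mu_{T_1'}) - \phi(\mu_{T_2'}) = \sum_{t=T_2'}^{T_1'-1} \left( \phi(\mu_{t+1}) - \phi(\mu_t) \right) \ge \sum_{t=T_2'}^{T_1'-1} \frac{2\epsilon_1}{3(t+1)} \ge 0, \tag{A.12}$$

$$\phi(\mu_{T_2}) - \phi(\mu_{T_1}) = \sum_{t=T_1}^{T_2-1} \left( \phi(\mu_{t+1}) - \phi(\mu_t) \right) \ge \sum_{t=T_1}^{T_2-1} \frac{2\epsilon_1}{3(t+1)} \ge 0. \tag{A.13}$$

Since $\phi(\mu_{T_1'}) - \phi(\mu_{T_1}) = \left( \phi(\mu_{T_1'}) - \phi(\mu_{T_2'}) \right) + \left( \phi(\mu_{T_2'}) - \phi(\mu_{T_2}) \right) + \left( \phi(\mu_{T_2}) - \phi(\mu_{T_1}) \right)$, it follows from (A.11), (A.12) and (A.13) that

$$\phi(\mu_{T_1'}) - \phi(\mu_{T_1}) \ge \sum_{t=T_2}^{T_2'-1} \frac{2\epsilon_2}{3(t+1)}. \tag{A.14}$$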



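Step 4: Let $\epsilon_2 = a - M\bar{\delta}$ and $\epsilon_1 = \bar{\epsilon}$. By definition of $\bar{\epsilon}$ and $\bar{\delta}$ (see Step 2), it follows that $\epsilon_2 > \epsilon_1 > 0$. Assume that $\delta < \bar{\delta}$. Since $a = M\bar{\delta} + \epsilon_2 > M\delta + \epsilon_2 > M\delta + \epsilon_1$, we obtain $\mathcal{X}_{M\delta+\epsilon_1} \subset \mathcal{X}_{M\delta+\epsilon_2} \subset \mathcal{X}_a$. By definition of $a$, $\mathcal{X}_a$ is contained in disjoint neighborhoods of equilibria. Thus, it follows that the components of $\mathcal{X}_{M\delta+\epsilon_1}$ and $\mathcal{X}_{M\delta+\epsilon_2}$ are also contained in disjoint neighborhoods of equilibria. Hence, the definition of $f$ suggests that if $x \in \mathcal{X}_{M\delta+\epsilon_1}$ then $\|x_k - x\| \le f(M\delta + \epsilon_1)$ (similarly, if $x \in \mathcal{X}_{M\delta+\epsilon_2}$ then $\|x_k - x\| \le f(M\delta + \epsilon_2)$) for exactly one equilibrium $x_k$.

Let $T_1$, $T_2$, $T_1'$ and $T_2'$ be defined as in Step 3. In this step, by obtaining an upper bound on $\phi(\mu_{T_1'}) - \phi(\mu_{T_1})$ and refining the lower bound obtained in Step 3 for the given values of $\epsilon_1$ and $\epsilon_2$, we prove that after sufficient time $\mu_t$ can visit the component of $\mathcal{X}_{M\delta+\epsilon_1}$ in the neighborhood of only a single equilibrium.

Assume that $\mu_t$ leaves the component of the $(M\delta + \epsilon_1)$-equilibrium set in the neighborhood of equilibrium $x_k$, and that it reaches another component in the neighborhood of equilibrium $x_{k'}$. Since, by definition, $\mu_{T_1-1}, \mu_{T_1'} \in \mathcal{X}_{M\delta+\epsilon_1}$ and $\mu_{T_2-1}, \mu_{T_2'} \in \mathcal{X}_{M\delta+\epsilon_2}$, it follows that $\mu_{T_1-1}$ and $\mu_{T_2-1}$ belong to neighborhoods of equilibrium $x_k$, whereas $\mu_{T_1'}$ and $\mu_{T_2'}$ belong to neighborhoods of $x_{k'}$, i.e.,

$$\|x_k - \mu_{T_1-1}\| \le f(M\delta + \epsilon_1) \quad \text{and} \quad \|x_k - \mu_{T_2-1}\| \le f(M\delta + \epsilon_2), \tag{A.15}$$

whereas

$$\|x_{k'} - \mu_{T_1'}\| \le f(M\delta + \epsilon_1) \quad \text{and} \quad \|x_{k'} - \mu_{T_2'}\| \le f(M\delta + \epsilon_2). \tag{A.16}$$

By definition of $d$ we have $\|x_k - x_{k'}\| \ge d$. Since $a > M\delta + \epsilon_2$, it follows that $f(M\delta + \epsilon_2) \le f(a) < d/4$, and hence the second inequalities in (A.15) and (A.16), together with the triangle inequality, imply

$$\|\mu_{T_2'} - \mu_{T_2-1}\| > \frac{d}{2}. \tag{A.17}$$

Using this inequality, we next refine the lower bound on $\phi(\mu_{T_1'}) - \phi(\mu_{T_1})$ obtained in Step 3. By (10), with an update at time $t$, the empirical frequency distribution can change by at most

$$\|\mu_{t+1} - \mu_t\| = \frac{1}{t+1} \|I_t - \mu_t\| \le \frac{1}{t+1} \left( \|I_t\| + \|\mu_t\| \right) \le \frac{2M}{t+1}, \tag{A.18}$$

where the last inequality follows from the fact that $\mu_t = \{\mu_t^m\}_{m \in \mathcal{M}}$, $I_t = \{I_t^m\}_{m \in \mathcal{M}}$, and $\|I_t^m\|, \|\mu_t^m\| \le 1$ since $I_t^m, \mu_t^m \in \Delta E^m$. Hence, if $T_2$ is sufficiently large, then $\|\mu_{T_2} - \mu_{T_2-1}\|$ is small enough so that (A.17) implies $\|\mu_{T_2'} - \mu_{T_2}\| \ge \frac{d}{2}$. Using this together with (A.18), we conclude

$$\sum_{t=T_2}^{T_2'-1} \frac{2M}{t+1} \ge \sum_{t=T_2}^{T_2'-1} \|\mu_{t+1} - \mu_t\| \ge \Big\| \sum_{t=T_2}^{T_2'-1} \left( \mu_{t+1} - \mu_t \right) \Big\| = \|\mu_{T_2'} - \mu_{T_2}\| \ge \frac{d}{2}. \tag{A.19}$$

Thus, the lower bound on $\phi(\mu_{T_1'}) - \phi(\mu_{T_1})$ provided in (A.14) takes the following form:

$$\phi(\mu_{T_1'}) - \phi(\mu_{T_1}) \ge \sum_{t=T_2}^{T_2'-1} \frac{2\epsilon_2}{3(t+1)} \ge \frac{\epsilon_2 d}{6M}. \tag{A.20}$$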
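The step-size bound (A.18) used above is easy to check numerically. The sketch below runs discrete-time fictitious play on a hypothetical 2x2 coordination game and measures $\|\mu_{t+1} - \mu_t\|$ against $2M/(t+1)$. The game, the initial actions, the tie-breaking rule (lowest index), the Euclidean norm on the stacked frequency vector, and all function names are assumptions for illustration rather than details taken from the paper.

```python
import numpy as np

# Hypothetical payoffs U[m][a0, a1]: both players get 1 when actions match.
U = [np.eye(2), np.eye(2)]
M = 2  # number of players


def fictitious_play(steps, first_actions=(0, 1)):
    """Return an array whose row i is the stacked empirical frequency mu_{i+1}."""
    mu = [np.eye(2)[a] for a in first_actions]    # mu_1: first-period actions
    history = [np.concatenate(mu)]
    for t in range(1, steps):
        # Each player best-responds to the opponent's empirical frequency.
        br0 = int(np.argmax(U[0] @ mu[1]))
        br1 = int(np.argmax(mu[0] @ U[1]))
        indicators = [np.eye(2)[br0], np.eye(2)[br1]]
        # mu_{t+1} = (t * mu_t + I_t) / (t + 1)
        mu = [(t * mu[m] + indicators[m]) / (t + 1) for m in range(M)]
        history.append(np.concatenate(mu))
    return np.array(history)


def max_step_ratio(history, num_players):
    """Largest value of ||mu_{t+1} - mu_t|| * (t + 1) / (2 * num_players) over the run."""
    ratios = []
    for i in range(len(history) - 1):
        t = i + 1                                  # row i holds mu_t with t = i + 1
        step = np.linalg.norm(history[i + 1] - history[i])
        ratios.append(step * (t + 1) / (2 * num_players))
    return max(ratios)
```

A ratio of at most 1 from `max_step_ratio(fictitious_play(200), M)` is what (A.18) predicts; summing the same step lengths along an excursion gives the path-length estimate used in (A.19).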



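Next we provide an upper bound on $\phi(\mu_{T_1'}) - \phi(\mu_{T_1})$, using Lipschitz continuity of the potential and the properties of the $(M\delta + \epsilon_1)$-equilibrium set. Let $\overline{\phi}_k = \max_{\{x \mid \|x - x_k\| \le f(M\delta+\epsilon_1)\}} \phi(x)$ and define $y_k$ as a strategy profile which achieves this maximum. Similarly, let $\underline{\phi}_{k'} = \min_{\{x \mid \|x - x_{k'}\| \le f(M\delta+\epsilon_1)\}} \phi(x)$ and define $y_{k'}$ as a strategy profile which achieves this minimum. Observe that

$$\underline{\phi}_{k'} - \overline{\phi}_k = \phi(y_{k'}) - \phi(y_k) = \left( \phi(y_{k'}) - \phi(\mu_{T_1'}) \right) + \left( \phi(\mu_{T_1'}) - \phi(\mu_{T_1}) \right) + \left( \phi(\mu_{T_1}) - \phi(y_k) \right). \tag{A.21}$$

Note that by (A.15) and (A.16), and the definitions of $y_k$ and $y_{k'}$, we have $\mu_{T_1'}, y_{k'} \in \{x \mid \|x - x_{k'}\| \le f(M\delta + \epsilon_1)\}$ and $\mu_{T_1-1}, y_k \in \{x \mid \|x - x_k\| \le f(M\delta + \epsilon_1)\}$. Hence, using Lipschitz continuity of $\phi$ (and denoting the Lipschitz constant by $L$) it follows that $\phi(y_{k'}) - \phi(\mu_{T_1'}) \ge -2Lf(M\delta + \epsilon_1)$ and $\phi(\mu_{T_1-1}) - \phi(y_k) \ge -2Lf(M\delta + \epsilon_1)$. Moreover, (A.18) and Lipschitz continuity of $\phi$ imply that $\phi(\mu_{T_1}) - \phi(\mu_{T_1-1}) = O\!\left(\frac{1}{T_1}\right)$. Thus, using (A.21) we obtain the following upper bound on $\phi(\mu_{T_1'}) - \phi(\mu_{T_1})$:

$$\underline{\phi}_{k'} - \overline{\phi}_k + 4Lf(M\delta + \epsilon_1) + O\!\left(\frac{1}{T_1}\right) \ge \phi(\mu_{T_1'}) - \phi(\mu_{T_1}). \tag{A.22}$$

Using the lower and upper bounds we obtained in (A.20) and (A.22), it follows that

$$\underline{\phi}_{k'} - \overline{\phi}_k + 4Lf(M\delta + \epsilon_1) + O\!\left(\frac{1}{T_1}\right) \ge \frac{\epsilon_2 d}{6M}. \tag{A.23}$$

Since $\epsilon_2 = a - M\bar{\delta}$ and $\epsilon_1 = \bar{\epsilon}$, using the fact that $f$ is an increasing function and $\delta < \bar{\delta}$, it follows from (A.23) that

$$\underline{\phi}_{k'} - \overline{\phi}_k \ge \frac{(a - M\bar{\delta})d}{6M} - 4Lf(M\delta + \bar{\epsilon}) + O\!\left(\frac{1}{T_1}\right) \ge \frac{(a - M\bar{\delta})d}{6M} - 4Lf(M\bar{\delta} + \bar{\epsilon}) + O\!\left(\frac{1}{T_1}\right).$$

Note that (A.8) implies $\frac{(a - M\bar{\delta})d}{6M} - 4Lf(M\bar{\delta} + \bar{\epsilon}) > 0$. Thus, for sufficiently large $T_1$ we obtain $\underline{\phi}_{k'} - \overline{\phi}_k > 0$. Therefore, we conclude that when $\mu_t$ leaves the component of $\mathcal{X}_{M\delta+\epsilon_1}$ contained in the neighborhood of some equilibrium $x_k$ and enters that of another equilibrium $x_{k'}$, the minimum potential in the new neighborhood is strictly larger than the maximum potential in the old one (for sufficiently large $T_1$). Since this is true for arbitrary equilibria $x_k$ and $x_{k'}$, it follows that after entering the component of $\mathcal{X}_{M\delta+\epsilon_1}$ in the neighborhood of $x_{k'}$, $\mu_t$ cannot return to the component in the neighborhood of $x_k$, as doing so contradicts the relation between the minimum and maximum potentials in these neighborhoods. Thus, after sufficient time, $\mu_t$ can visit the component of $\mathcal{X}_{M\delta+\epsilon_1}$ (or equivalently $\mathcal{X}_{M\delta+\bar{\epsilon}}$) in the neighborhood of only a single equilibrium.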

Step 5: Let $\epsilon_1$ and $\epsilon_2$ be such that $0 < \epsilon_1 < \epsilon_2 \le \bar{\epsilon}$. As established in Step 4, there exists some $T$ such that for $t > T$, $\mu_t$ visits the component of $\mathcal{X}_{M\delta+\bar{\epsilon}}$ in the neighborhood of a single equilibrium, say $x_k$.

Assume that $T_1$, $T_2$, $T_1'$ and $T_2'$ are defined as in Step 3, and let $T_1 > T + 1$. Since $\epsilon_1 < \epsilon_2 \le \bar{\epsilon}$, we have $\mathcal{X}_{M\delta+\epsilon_1} \subset \mathcal{X}_{M\delta+\epsilon_2} \subset \mathcal{X}_{M\delta+\bar{\epsilon}}$, and $T_1 > T + 1$ implies that $\mu_t$ can only visit the components of $\mathcal{X}_{M\delta+\epsilon_1}$ and $\mathcal{X}_{M\delta+\epsilon_2}$ contained in the neighborhood of $x_k$. Following a similar approach to Step 4, we next obtain upper and lower bounds on $\phi(\mu_{T_1'}) - \phi(\mu_{T_1})$, and use these bounds to establish convergence to the mixed equilibrium set given in the theorem statement.

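Define $d^*$ as the maximum distance of $\mu_t$ from $\mathcal{X}_{M\delta+\epsilon_2}$ for $t$ such that $T + 1 < T_2 \le t \le T_2' - 1$, i.e.,

$$d^* = \max_{t \in \{T_2, \dots, T_2'-1\}} \ \min_{x \in \mathcal{X}_{M\delta+\epsilon_2}} \|\mu_t - x\|.$$

Since $\mu_{T_2-1}, \mu_{T_2'} \in \mathcal{X}_{M\delta+\epsilon_2}$ by definition, the total length of the trajectory between $T_2 - 1$ and $T_2'$ is an upper bound on $2d^*$, i.e.,

$$2d^* \le \sum_{t=T_2-1}^{T_2'-1} \|\mu_{t+1} - \mu_t\|.$$

As explained in (A.18), $\|\mu_{t+1} - \mu_t\| \le \frac{2M}{t+1}$, thus the above inequality implies

$$2d^* \le \sum_{t=T_2-1}^{T_2'-1} \frac{2M}{t+1} = \sum_{t=T_2}^{T_2'-1} \frac{2M}{t+1} + \frac{2M}{T_2}. \tag{A.24}$$

Using this inequality, the lower bound in (A.14) implies

$$\phi(\mu_{T_1'}) - \phi(\mu_{T_1}) \ge \sum_{t=T_2}^{T_2'-1} \frac{2\epsilon_2}{3(t+1)} \ge \frac{2\epsilon_2}{3M} \left( d^* - \frac{M}{T_2} \right). \tag{A.25}$$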



We next obtain an upper bound on $\phi(\mu_{T_1'}) - \phi(\mu_{T_1})$. By definition of $f$, $\mathcal{X}_{M\delta+\epsilon_1}$ is contained in $f(M\delta + \epsilon_1)$ neighborhoods of equilibria. For $T_1 > T + 1$, $\mu_t$ can only visit the component of $\mathcal{X}_{M\delta+\epsilon_1}$ in the neighborhood of $x_k$, as can be seen from the definition of $T$. Thus, since $\mu_{T_1-1}, \mu_{T_1'} \in \mathcal{X}_{M\delta+\epsilon_1}$, it follows that $\mu_{T_1-1}, \mu_{T_1'} \in \{x \mid \|x - x_k\| \le f(M\delta + \epsilon_1)\}$. By Lipschitz continuity of the potential function it follows that $\phi(\mu_{T_1'}) - \phi(\mu_{T_1-1}) \le 2f(M\delta + \epsilon_1)L$. Additionally, by (A.18), Lipschitz continuity also implies that $\phi(\mu_{T_1}) - \phi(\mu_{T_1-1}) \ge -\frac{2ML}{T_1}$. Combining these we obtain the following upper bound on $\phi(\mu_{T_1'}) - \phi(\mu_{T_1})$:

$$\phi(\mu_{T_1'}) - \phi(\mu_{T_1}) \le 2f(M\delta + \epsilon_1)L + \frac{2ML}{T_1}. \tag{A.26}$$



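It follows from the upper and lower bounds on $\phi(\mu_{T_1'}) - \phi(\mu_{T_1})$ given in (A.25) and (A.26) that

$$\frac{2\epsilon_2}{3M} \left( d^* - \frac{M}{T_2} \right) \le 2f(M\delta + \epsilon_1)L + \frac{2ML}{T_1}.$$

Thus, for sufficiently large $T_1$ (and hence $T_2$), we obtain

$$d^* \le \frac{3f(M\delta + \epsilon_1)ML}{\epsilon_2} + \frac{3M^2L}{\epsilon_2 T_1} + \frac{M}{T_2} \le \frac{4f(M\delta + \epsilon_1)ML}{\epsilon_2}. \tag{A.27}$$

Note that in the above derivation $\epsilon_1$ is an arbitrary number that satisfies $0 < \epsilon_1 < \epsilon_2$. Thus, (A.27) implies that

$$d^* \le \limsup_{\epsilon_1 \to 0} \frac{4f(M\delta + \epsilon_1)ML}{\epsilon_2} \le \frac{4f(M\delta)ML}{\epsilon_2}, \tag{A.28}$$

where the last inequality follows by upper semicontinuity of $f$. Thus, by definition of $d^*$, we conclude that $\mu_t$ converges to a $d^*$ neighborhood of $\mathcal{X}_{M\delta+\epsilon_2}$. Hence, using (A.28), we can establish convergence of $\mu_t$ to

$$\left\{ x \ \Big|\ \|x - y\| \le \frac{4f(M\delta)ML}{\epsilon_2}, \ \text{for some } y \in \mathcal{X}_{M\delta+\epsilon_2} \right\}. \tag{A.29}$$

Observe that the definition of $f$ implies that if $y \in \mathcal{X}_{M\delta+\epsilon_2}$, then for some equilibrium $x_k$ we have $\|x_k - y\| \le f(M\delta + \epsilon_2)$. Thus, using (A.29) and the triangle inequality, we conclude that $\mu_t$ converges to

$$\left\{ x \ \Big|\ \|x - x_k\| \le \frac{4f(M\delta)ML}{\epsilon_2} + f(M\delta + \epsilon_2), \ \text{for some equilibrium } x_k \right\}. \tag{A.30}$$

Noting that in (A.30) $\epsilon_2$ is an arbitrary number satisfying $0 < \epsilon_2 \le \bar{\epsilon}$, the theorem follows. $\square$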
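To get a feel for the radius appearing in (A.30), the sketch below evaluates $\frac{4f(M\delta)ML}{\epsilon} + f(M\delta + \epsilon)$ and scans a grid of admissible $\epsilon$ values for the smallest radius. The stand-in `sqrt_f` for $f$, the grid search, the function names, and all numerical inputs are assumptions for illustration; the paper does not prescribe a particular $f$ or a method for choosing $\epsilon$.

```python
def sqrt_f(x):
    """Hypothetical stand-in for f: increasing, with f(0) = 0."""
    return x ** 0.5


def neighborhood_radius(f, M, L, delta, eps):
    """Radius in (A.30): 4 f(M delta) M L / eps + f(M delta + eps)."""
    return 4.0 * f(M * delta) * M * L / eps + f(M * delta + eps)


def best_radius(f, M, L, delta, eps_bar, n=1000):
    """Smallest radius over a uniform grid of eps in (0, eps_bar]; returns (radius, eps)."""
    candidates = [eps_bar * (i + 1) / n for i in range(n)]
    return min((neighborhood_radius(f, M, L, delta, e), e) for e in candidates)
```

The two terms pull in opposite directions: the first shrinks as $\epsilon$ grows while the second grows with $\epsilon$, so an intermediate $\epsilon$ typically gives the tightest set. When $\delta = 0$ the first term vanishes and the radius reduces to $f(\epsilon)$.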